Abstract. Information-theoretic key agreement is impossible to achieve from scratch and must be based
on some — ultimately physical — premise. In 2005, Barrett, Hardy, and Kent showed that unconditional 
security can be obtained in principle based on the impossibility of faster-than-light signaling; however, 
their protocol is inefficient and cannot tolerate any noise. While their key-distribution scheme uses 
quantum entanglement, its security only relies on the impossibility of superluminal signaling, rather 
than the correctness and completeness of quantum theory. In particular, the resulting security is device 
independent. Here we introduce a new protocol which is efficient in terms of both classical and quantum 
communication, and that can tolerate noise in the quantum channel. We prove that it offers device- 
independent security under the sole assumption that certain non-signaling conditions are satisfied. Our 
main insight is that the XOR of a number of bits that are partially secret according to the non-signaling 
conditions turns out to be highly secret. Note that similar statements have been well-known in classical 
contexts. Earlier results had indicated that amplification of such non-signaling-based privacy is impossible 
to achieve if the non-signaling condition only holds between events on Alice's and Bob's sides. Here, we 
show that the situation changes completely if such a separation is given within each of the laboratories. 

1 Introduction, Motivation, and Our Result 

1.1 Minimizing Assumptions for Information-Theoretic Key Agreement 

It is well-established that information-theoretic secrecy must be based on certain premises such as
noise in communication channels [46], [IB], [34], a limitation on an adversary's memory [33], [19],
or the uncertainty principle of quantum physics [JJ]. In traditional quantum key distribution, the
security proof is based on 

1. the postulates of quantum physics, 

2. the assumptions that the used devices transmit and operate on the specified quantum systems, 
and 

3. that Eve does not get information about the generated key out of the legitimate partners' labo- 
ratories. 

This article is concerned with a variant of quantum key distribution which allows the first two as- 
sumptions to be dropped, if at the same time, the third is augmented by the assumption that no 
unauthorized information is exchanged between the legitimate laboratories. One possibility to guar- 
antee this is via the non-signaling postulate of relativity, if different measurement events are carried 
out in a space-like separated way. Of particular importance is device independence (i.e., dropping
condition 2), for two reasons. First, the necessity to trust the manufacturer is never satisfactory.
Second, the security of traditional protocols for quantum key distribution relies crucially on the fact 
that single Qbits (i.e., photons) are sent. For instance, the BB84 protocol [7] becomes completely 
insecure if larger systems, such as pairs of photons, are transmitted. With present technology, this 
is a significant issue. The fact that practical deviations from the theoretical model open the possibility
of attacks has been demonstrated experimentally, see [22], [2T], [4T], [UJ], [33], and references
therein.

The question of device-independent security has been raised by Mayers and Yao in [36].^4 That
such security is possible in principle follows from [5]; however, only a zero secret-key rate has been
achieved, and in addition the classical communication cost is exponential. Later schemes that are
robust against noise and achieve a positive key rate have been proven secure against certain restricted
types of attacks [3], [32], [2], [JJ]. The current state of the art is that security holds against arbitrary
attacks, provided that no (quantum) correlation is introduced between subsequent measurements, see, e.g., [38].

4 The work by Mayers and Yao initiated further investigation on how to test the correct working of quantum devices
(not restricted to quantum cryptography) [45], [37], [28].

1.2 Relativity-Based Key Distribution 

It is possible to generate a secret key assuming only that information transmission faster than at 
the speed of light is impossible. The basic idea, as proposed by Barrett, Hardy, and Kent [5], is 
as follows: By communication over a quantum channel, two parties, Alice and Bob, generate some 
shared entangled quantum state. They carry out measurements in a space-like separated way, i.e.,
no signaling is possible between the measurement events. Alice and Bob then verify the statistics 
of the measurement outcomes. Given that these satisfy certain specified properties, the privacy of 
the data follows directly from the correlations in the resulting data and is independent of whatever 
quantum systems the devices operate on. It is not even necessary to assume that the possibilities of
what an adversary can do are limited by quantum physics: The latter guarantees the protocol to work
(i.e., leads to the expected correlations, the occurrence of which can be verified), but the security is 
completely independent of it. A consequence is that protocols can be given which are secure if either 
quantum physics or relativity (or both, of course) is correct. 

How is it possible to derive secrecy directly from correlations? In quantum physics, this is well
known: Quantum correlations, called entanglement, are monogamous to some extent [44]: If Alice
and Bob are maximally entangled, then Eve factors out and is independent. However, we do not 
know such an effect classically: If Alice and Bob have highly correlated bits, Eve can nevertheless 
know them. The point is that we have to look at the — so-called non-local — input-output behavior 
of systems. 

1.3 Systems, Correlations, and Non-Locality 

In order to explain non-local correlations, we introduce the notion of a two-party system, defined by
its joint input-output behavior $P_{XY|UV}$ (see Figure 1).




Fig. 1. A two-party system. If it does not allow for message transmission, it is called a box. 



Definition 1. A system is a bi- (or more-) partite conditional probability distribution $P_{XY|UV}$. It
is local if $P_{XY|UV} = \sum_{i=1}^{n} w_i P^i_{X|U} P^i_{Y|V}$ holds for some weights $w_i \geq 0$ and conditional distributions
$P^i_{X|U}$ and $P^i_{Y|V}$, $i = 1, \ldots, n$. A system is signaling if it allows for message transmission; it
is non-signaling if $\sum_x P_{XY|UV}(x, y, u, v) = \sum_x P_{XY|UV}(x, y, u', v)$ for all $y, v$ (and similarly with the
roles of the interfaces exchanged). We call a non-signaling system a box.

Lemma 1 states that locality is equivalent to the possibility that the outputs to alternative inputs
are consistently pre-determined (see Figure 2).

Lemma 1. For any system $P_{XY|UV}$, where $\mathcal{U}$ and $\mathcal{V}$ are the ranges of $U$ and $V$, respectively, the
following conditions are equivalent:

1. $P_{XY|UV}$ is local,

2. there exist random variables $X_u$ ($u \in \mathcal{U}$) and $Y_v$ ($v \in \mathcal{V}$) with a joint distribution such that the
marginals satisfy $P_{X_u Y_v} = P_{XY|U=u,V=v}$.

Proof. Assume first that $P_{XY|UV}$ is local, i.e., $P_{XY|UV} = \sum_i w_i P^i_{X|U} P^i_{Y|V}$. For $\mathcal{U} = \{u_1, u_2, \ldots, u_m\}$
and $\mathcal{V} = \{v_1, v_2, \ldots, v_n\}$, define

$$P_{X_{u_1} \cdots X_{u_m} Y_{v_1} \cdots Y_{v_n}}(x_1, \ldots, x_m, y_1, \ldots, y_n) := \sum_i w_i P^i_{X|U=u_1}(x_1) \cdots P^i_{X|U=u_m}(x_m) \cdot P^i_{Y|V=v_1}(y_1) \cdots P^i_{Y|V=v_n}(y_n).$$

This distribution has the desired property.

To see the reverse direction, let $X_{u_1} \cdots X_{u_m} Y_{v_1} \cdots Y_{v_n}$ be the shared randomness. □
Fig. 2. Locality means that alternative outputs consistently coexist.



We cryptographically exploit the contraposition of the statement: As soon as a system behaves non- 
locally, the outputs cannot exist before the input is given, i.e., the measurement is actually carried 
out. In particular, these outputs cannot have been stored in the devices previously, and they cannot 
be known to an adversary. 

1.4 Non-Locality Implies Secrecy 

In order to explain this idea more explicitly, let us consider a specific example of a system (see also
Figure 3).

Definition 2. [40] A Popescu-Rohrlich box (or PR box for short) is the following bipartite system
$P_{XY|UV}$: For each input pair $(u, v)$, the random variable $X$ is a random bit and we have

$$\mathrm{Prob}[X \oplus Y = U \cdot V] = 1. \qquad (1)$$

Fig. 3. The PR box.



John Bell's theorem from 1964 [6] implies that this system is indeed non-local. More precisely,
any system that behaves like a PR box with probability greater than 75% is non-local. The reason is that
the four conditions represented by (1) (one for each input combination) are contradictory, and only
three can be satisfied at a time. Interestingly, when one is allowed to measure entangled quantum
states, one can achieve roughly 85%.
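The 75% figure can be checked by brute force. The sketch below enumerates all 16 deterministic local strategies (local systems are convex combinations of these) and counts how many of the four conditions each one satisfies. The quantum value cos^2(pi/8) is quoted as a known fact (Tsirelson's bound) and is not derived here; the function and variable names are illustrative.

```python
from itertools import product
from math import cos, pi

def chsh_success(a, b):
    """Fraction of the four input pairs (u, v) with a[u] XOR b[v] == u AND v."""
    wins = sum((a[u] ^ b[v]) == (u & v) for u, v in product((0, 1), repeat=2))
    return wins / 4

# All 16 deterministic local strategies: Alice outputs a[u], Bob outputs b[v].
strategies = [((a0, a1), (b0, b1)) for a0, a1, b0, b1 in product((0, 1), repeat=4)]
local_max = max(chsh_success(a, b) for a, b in strategies)

# Best value reachable with entangled quantum states (known fact, not computed here).
quantum_max = cos(pi / 8) ** 2

print(local_max)               # 0.75
print(round(quantum_max, 4))   # 0.8536
```

No strategy satisfies all four conditions, which is the contradiction mentioned above.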

The type of non-locality characterized by the PR box is often called CHSH non-locality after [17],
and we will sometimes call condition (1) the CHSH condition.

Note that the PR box is non-signaling: $X$ and $Y$ separately are perfectly random bits and
independent of the input pair. On the other hand, a system $P_{XY|UV}$ (where all variables are bits)
satisfying (1) is non-signaling only if the outputs are completely unbiased, given the input pair,
i.e., $P_{X|U=u,V=v}(0) = P_{Y|U=u,V=v}(0) = 1/2$. In other words, the output bit can neither be
pre-determined nor slightly biased. Assume that Alice and Bob share any kind of physical system, carry
out space-like separated measurements (hereby excluding message transmission), and measure data
having the statistics of a PR box. The outputs must then be perfectly secret bits because even when
conditioned on an adversary's complete information, the correlation between Alice and Bob must
still be non-signaling and fulfill equation (1).
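As a small sanity check of this claim, the following sketch encodes the PR box and a hypothetical box that also satisfies (1) but has a pre-determined output for Alice, and tests the non-signaling condition of Definition 1 on both. The helper names and the second box are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def pr_box(x, y, u, v):
    """P(x, y | u, v) of a perfect PR box: uniform over outputs with x XOR y == u AND v."""
    return Fraction(1, 2) if (x ^ y) == (u & v) else Fraction(0)

def fixed_output_box(x, y, u, v):
    """Hypothetical box satisfying (1) with Alice's output fixed to 0, so y = u AND v."""
    return Fraction(1) if x == 0 and y == (u & v) else Fraction(0)

def is_non_signaling(box):
    """Each party's output marginal must not depend on the other party's input."""
    for x, u in product(BITS, repeat=2):
        if len({sum(box(x, y, u, v) for y in BITS) for v in BITS}) != 1:
            return False
    for y, v in product(BITS, repeat=2):
        if len({sum(box(x, y, u, v) for x in BITS) for u in BITS}) != 1:
            return False
    return True

def alice_marginal(box, x, u, v):
    return sum(box(x, y, u, v) for y in BITS)

print(is_non_signaling(pr_box))            # True
print(is_non_signaling(fixed_output_box))  # False
```

In the second box, Bob's output reveals Alice's input whenever his own input is 1, which is why a biased output is incompatible with (1) under non-signaling.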

Unfortunately, however, the behavior of perfect PR boxes does not occur in nature: Quantum
physics is non-local, but not maximally so. Can we also obtain secret bits from weaker,
quantum-physically achievable, non-locality? Barrett, Hardy, and Kent [5] have shown that the answer is yes.
Their protocol is, however, inefficient: In order to reduce the probability that the adversary learns
a generated bit shared by Alice and Bob below $\varepsilon$, they have to communicate $O(1/\varepsilon)$ Qbits.

If we measure maximally entangled quantum states, we can get at most 85%-approximations
to the PR box's behavior. Fortunately, any non-locality implies some secrecy. In order to illustrate
this, consider a system approximating a PR box with probability $1 - \varepsilon$ for all inputs. More precisely,
we have

$$\mathrm{Prob}[X \oplus Y = U \cdot V \mid U = u, V = v] = 1 - \varepsilon \qquad (2)$$

for all $(u, v) \in \{0,1\}^2$. Then, what is the maximal possible bias $p := \mathrm{Prob}[X = 0 \mid U = 0, V = 0]$
such that the system is non-signaling?
We explain the table: Because of (2), the bias of $Y$, given $U = V = 0$, must be at least $p - \varepsilon$.
Because of non-signaling, $X$'s bias must be $p$ as well when $V = 1$, and so on. Finally, condition (2)
for $U = V = 1$ implies $p - \varepsilon - (1 - (p - 2\varepsilon)) \leq \varepsilon$, hence, $p \leq 1/2 + 2\varepsilon$. For any $\varepsilon < 1/4$, this is a
non-trivial bound. (This reflects the fact that $\varepsilon = 1/4$ is the "local limit.")
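The steps hidden behind "and so on" can be spelled out as a chain of inequalities. The following is a reconstruction from condition (2) and the non-signaling property alone, reading "bias" as the probability of output 0; it is a sketch of the argument, not a reproduction of the original table.

```latex
\begin{align*}
\text{(2) at } (0,0):\quad & P[Y=0 \mid U=0,V=0] \ge p - \varepsilon \\
\text{non-signaling}:\quad & P[Y=0 \mid U=1,V=0] \ge p - \varepsilon \\
\text{(2) at } (1,0):\quad & P[X=0 \mid U=1,V=0] \ge p - 2\varepsilon \\
\text{non-signaling}:\quad & P[X=0 \mid U=1,V=1] \ge p - 2\varepsilon \\
\text{non-signaling}:\quad & P[X=0 \mid U=0,V=1] = p \\
\text{(2) at } (0,1):\quad & P[Y=0 \mid U=0,V=1] \ge p - \varepsilon \\
\text{non-signaling}:\quad & P[Y=0 \mid U=1,V=1] \ge p - \varepsilon \\
\text{(2) at } (1,1):\quad & P[Y=0 \mid U=1,V=1] - P[X=1 \mid U=1,V=1]
      \le P[X=Y \mid U=1,V=1] = \varepsilon \\
\Longrightarrow\quad & (p - \varepsilon) - \bigl(1 - (p - 2\varepsilon)\bigr) \le \varepsilon
      \;\Longrightarrow\; p \le \tfrac{1}{2} + 2\varepsilon
\end{align*}
```

Each use of (2) costs at most $\varepsilon$, and each use of non-signaling transfers a marginal to a different input of the other party without loss.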

Conditioned on Eve's entire information, this reads: Weak non-locality means weak secrecy.
Can it be amplified? Privacy amplification is a concept well-known from classical [9], [25], [8] and
quantum [26] cryptography, and means transforming a weakly secret string into a highly secret key
by hashing. These results are not applicable with respect to non-signaling privacy since this is a
strictly stronger notion, i.e., the attacker has more possible courses of action.^5 In [23] it has been
pessimistically argued that privacy amplification of non-signaling secrecy is impossible, the problem
being that certain collective attacks exist that leave the adversary with significant information about
the final key, however the latter is obtained from the raw key.

5 The only restriction by which the possibilities of such an adversary are limited is the non-signaling condition.
Non-signaling secrecy has been shown achievable under the additional assumption that the adversary can only attack
each of the boxes separately [5], [35], [2]. In general, however, an adversary may of course attack them jointly;
this corresponds to a coherent attack. In quantum mechanics, three types of attacks (individual, collective, and
coherent) are distinguished [12], [10]. In an individual attack, the eavesdropper attacks and measures each
system identically and independently; in a collective attack the adversary still attacks each system identically and
independently, but can make a joint measurement; finally, the most general attack is a coherent attack, where no
restrictions apply.



Fortunately, the situation changes completely when one assumes a non-signaling condition between
the individual measurements performed within Alice's as well as Bob's laboratories (see Figure 7).
This non-signaling condition could, for instance, be enforced by a space-like separation of the
individual measurement events. In [29], Masanes has shown that in this case, privacy amplification
is possible in principle, using as hash function a function chosen at random from the set of all
functions.^6 Later, he has shown that it is sufficient to consider a two-universal set of functions (this
proof is included in [32], Section IV.C).



1.5 Main Result 

We show that there exists a protocol for efficiently generating a secret key, whose security is based
on non-signaling conditions only (Theorem 3). The protocol consists of measuring $n$ copies of a maximally
entangled state, where all $2n$ measurement events are supposed to be space-like separated.
Our result is distinct from Masanes' in the sense that we show a single explicit function, namely
the XOR, to be a good privacy-amplification function. More precisely, we prove a lemma that the
adversary's probability of correctly predicting the XOR of the outcomes of $n$ boxes is exponentially
(in $n$) close to 1/2 (see Lemma 6). This can be seen as a generalization of the well-known fact that
the XOR of many partially uniform bits is almost uniform and may be of independent interest.
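The classical fact referred to here (often called the piling-up lemma) can be illustrated with exact arithmetic. The sketch below covers only the classical case of independent biased bits, not the non-signaling statement proved in this paper; the function name and the chosen bias of 1/4 are illustrative.

```python
from fractions import Fraction
from itertools import product

def xor_prob_zero(probs_zero):
    """Exact P(XOR of independent bits == 0), given each bit's P(bit == 0)."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(probs_zero)):
        if sum(bits) % 2 == 0:
            weight = Fraction(1)
            for bit, p0 in zip(bits, probs_zero):
                weight *= p0 if bit == 0 else 1 - p0
            total += weight
    return total

p0 = Fraction(3, 4)  # every bit is 0 with probability 3/4, i.e., bias 1/4
for n in (1, 2, 4, 8):
    bias = xor_prob_zero([p0] * n) - Fraction(1, 2)
    # Piling-up lemma: bias of the XOR is 2^(n-1) * (1/4)^n = (1/2)^(n+1).
    assert bias == Fraction(1, 2) ** (n + 1)
    print(n, bias)
```

The bias of the XOR shrinks exponentially in the number of bits, which is the behavior the main lemma extends to the non-signaling setting.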



Since the security of our protocol, which is universally composable, is implied by the observed correlations
alone, it is automatically device-independent. This means that nothing needs to be known
about the internal workings of the quantum devices used for its implementation (such as photon
sources or detectors) and their manufacturer need not be trusted. Moreover, a certain amount of
noise can be tolerated: Our scheme has a positive key-generation rate whenever the correlations
approximate PR boxes with an accuracy exceeding 80% and the output bits are correlated with
more than 98% when Alice and Bob both choose to measure in the first basis (see Figure 4).




Fig. 4. The parameter regions for which key agreement is possible (red), reachable by quantum
mechanics (blue) and their intersection (green). $\varepsilon$ is the probability of violating the CHSH condition
(i.e., $X \oplus Y \neq U \cdot V$) for uniform inputs, and $\delta$ the probability of not having the same output bits
on input (0, 0).



1.6 Outline 

The rest of our paper is organized as follows. In Section 2 we describe the model and the general set
of possible strategies of a non-signaling adversary. In Section 3 we motivate our security definition.
Then we first consider the case of a single approximation of a PR box and give a tight bound on
the adversarial knowledge on the outputs of such a box (Section 4). We then proceed to the general
case of $n$ approximations of a PR box (Section 5). We show that the XOR of several output bits
is as secure as when an adversary attacks each of the boxes independently and individually, hence,
the XOR is a good privacy-amplification function. In Section 6 we show that we can use the XOR
of several bits to do information reconciliation and privacy amplification such as to obtain a secret
key. We determine the key rate, show how we can attain the region allowing for a positive key rate
using quantum mechanics, and finally give the resulting key generation protocol.

6 Masanes' result implies that there exists a fixed function which can be used for privacy amplification, but the proof
is non-constructive, i.e., the function cannot be given explicitly.

2 Modeling Non-Signaling Adversaries 

When Alice, Bob, and Eve carry out measurements on a (joint) physical system, they can choose
their measurement settings (the inputs) and receive their respective outcomes (the outputs). It is,
therefore, natural to model the situation by a tripartite input-output system, characterized by a
conditional distribution $P_{XYZ|UVW}$. The question we study in the following is: Given a certain
two-party system shared by Alice and Bob, which extensions to a three-party system, including
the adversary Eve, are possible? And is it possible for Alice and Bob to create a secret key by
interacting with their respective parts of the system and communicating over a public channel?
The only condition hereby is that the entire system must be non-signaling,^7 i.e., the input/output
behavior of one side tells nothing about the input on the other side(s) (and also, dividing the ends
of the box in any two subsets, the input/output behavior of one subset tells nothing about the input
of the other).

Fig. 5. The tripartite scenario including the eavesdropper.

Condition 1. The system $P_{XYZ|UVW}$ must not allow for signaling:

$$\sum_x P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_x P_{XYZ|UVW}(x, y, z, u', v, w) \quad \forall y, z, v, w$$
$$\sum_y P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_y P_{XYZ|UVW}(x, y, z, u, v', w) \quad \forall x, z, u, w$$
$$\sum_z P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_z P_{XYZ|UVW}(x, y, z, u, v, w') \quad \forall x, y, u, v$$



If a system is non-signaling between its interfaces, this also means that its marginal systems are well- 
defined: What happens at one of the interfaces does not depend on any other input. This implies 
that at all the interfaces, an output can always be provided immediately after the input has been 
given. 

On the other hand, we do allow for Eve to delay her choice of input (measurement) until all of
Alice's and Bob's communication is finished; in particular, Eve knows the protocol of Alice and Bob
and could get information about Alice and Bob's inputs, e.g., by wiretapping messages exchanged
by them during the protocol, and she can adapt her strategy.

7 In practice, the non-signaling condition can be ensured by carrying out all measurements in a space-like separated
way (the system is then non-signaling by relativity theory) or, alternatively, by placing every partial system into
a shielded laboratory. It is also a direct consequence of the assumption usually made in quantum key distribution,
that the Hilbert space is the tensor product of the Hilbert spaces associated with each party.

This tripartite scenario can be reduced to a bipartite one: Because Eve cannot signal to Alice
and Bob (even together) by her choice of input, we must have

$$\sum_z P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_z P_{XYZ|UVW}(x, y, z, u, v, w') = P_{XY|UV}(x, y, u, v),$$

and this is exactly the marginal box as seen by Alice and Bob. We can, therefore, see Eve's input
as a choice of convex decomposition of Alice's and Bob's box and her output as indicating one part
of the decomposition. Further, the condition that even Alice and Eve together must not be able to
signal to Bob and vice versa means that the distribution conditioned on Eve's outcome, $P^z_{XY|UV}$,
must also be non-signaling between Alice and Bob. Informally, we can write

$$P_{XY|UV} = \sum_z p^z \cdot P^z_{XY|UV},$$

and this also covers all possibilities available to Eve. Formally, we define:

Definition 3. A box partition of a given bipartite box $P_{XY|UV}$ is a family of pairs $\{(p^z, P^z_{XY|UV})\}_z$,
where $p^z$ is a weight and $P^z_{XY|UV}$ is a box, such that $P_{XY|UV} = \sum_z p^z \cdot P^z_{XY|UV}$.

This definition allows us to change between the scenario of a bipartite box plus box partition and 
the scenario of a tripartite box, as stated in the following two lemmas. 

Lemma 2. For any given tripartite box $P_{XYZ|UVW}$, any input $w$ induces a box partition of the
bipartite box $P_{XY|UV}$ parametrized by $z$ with $p^z := p(z|w)$ and $P^z_{XY|UV} := P_{XY|UV,Z=z,W=w}$.

Lemma 3. Given a bipartite box $P_{XY|UV}$, let $\mathcal{W}$ be a set of box partitions $w = \{(p^z, P^z_{XY|UV})\}_z$.
Then the tripartite box, where the input of the third party is $w \in \mathcal{W}$, defined by $P_{XYZ|UV,W=w}(z) :=
p^z \cdot P^z_{XY|UV}$, is non-signaling and has marginal box $P_{XY|UV}$.
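The construction in Lemma 3 can be tried out on a toy example. In the sketch below, Alice and Bob's marginal is the uniformly random box, and Eve's input selects one of two box partitions of it: the trivial one, or an equal mixture of a PR box and its complement. The example boxes and all names are assumptions chosen for illustration; the check confirms that Eve's choice of partition is invisible to Alice and Bob.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
HALF = Fraction(1, 2)

def pr(x, y, u, v):
    return HALF if (x ^ y) == (u & v) else Fraction(0)

def anti_pr(x, y, u, v):
    return HALF if (x ^ y) != (u & v) else Fraction(0)

def uniform(x, y, u, v):
    return Fraction(1, 4)

# Eve's input w selects a box partition: a list of (weight p^z, box P^z) indexed by z.
partitions = {
    0: [(Fraction(1), uniform)],
    1: [(HALF, pr), (HALF, anti_pr)],
}

def tripartite(x, y, z, u, v, w):
    """P(x, y, z | u, v, w) := p^z * P^z(x, y | u, v), as in Lemma 3."""
    weight, box = partitions[w][z]
    return weight * box(x, y, u, v)

def marginal(x, y, u, v, w):
    """Sum over Eve's output z: the box seen by Alice and Bob."""
    return sum(tripartite(x, y, z, u, v, w) for z in range(len(partitions[w])))

# Eve cannot signal: the Alice-Bob marginal is the same box for every input w.
for x, y, u, v in product(BITS, repeat=4):
    assert marginal(x, y, u, v, 0) == marginal(x, y, u, v, 1) == uniform(x, y, u, v)
```

With input w = 1, Eve learns from her output z whether Alice and Bob hold the PR box or its complement, although their own statistics are unchanged.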

Even if Alice and Bob have several input and output interfaces we need to assume that they
belong to a single big system which can be attacked by Eve as one, as depicted in Figure 6. This
also implies that Eve only has a single input and output variable (of any range). This scenario is
analogous to Eve being able to do coherent attacks in a quantum-key-distribution protocol.

Fig. 6. Alice and Bob share $n$ boxes which are independent from their viewpoint. However, Eve can
attack all of them at once.



However, Alice and Bob can make sure that the non-signaling condition holds between all of
their $2n$ input/output interfaces. The non-signaling condition then needs to hold even given Eve's
output $z$. We, therefore, extend Condition 1 from the tripartite to the $(2n + 1)$-partite case in the
obvious way and call such a system $(2n + 1)$-partite non-signaling (see Figure 7).

Fig. 7. The dashed lines mean space-like separation.

We study the particular case where Alice and Bob share $n$ approximations of a PR box, i.e.,
each of the $2n$ input/output interfaces takes one bit input and gives one bit output.^8 Note that
we assume that the boxes Alice and Bob share were created by Eve. We can, therefore, not make
any assumption about their form (i.e., the probability distribution describing them). In particular,
they need not be independent approximations of PR boxes. However, Alice and Bob can test the
properties of their systems and can ensure that the non-signaling condition holds between all $2n$
ends and even given Eve's output $z$, i.e., $P^z_{\mathbf{XY}|\mathbf{UV}}$ must not allow for signaling between any of the
$2n$ input/output bit pairs shared between Alice and Bob. We restate the condition, under which we
will prove security:

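Condition 1'. The system $P_{\mathbf{XY}Z|\mathbf{UV}W}$ must not allow for signaling between any of the $2n + 1$
marginal systems:

$$\sum_{x_i} P_{\mathbf{XY}Z|\mathbf{UV}W}(\mathbf{x}, \mathbf{y}, z, \mathbf{u} \backslash u_i, u_i, \mathbf{v}, w) = \sum_{x_i} P_{\mathbf{XY}Z|\mathbf{UV}W}(\mathbf{x}, \mathbf{y}, z, \mathbf{u} \backslash u_i, u_i', \mathbf{v}, w) \quad \forall \mathbf{x} \backslash x_i, \mathbf{y}, z, \mathbf{u} \backslash u_i, \mathbf{v}, w$$
$$\sum_{y_i} P_{\mathbf{XY}Z|\mathbf{UV}W}(\mathbf{x}, \mathbf{y}, z, \mathbf{u}, \mathbf{v} \backslash v_i, v_i, w) = \sum_{y_i} P_{\mathbf{XY}Z|\mathbf{UV}W}(\mathbf{x}, \mathbf{y}, z, \mathbf{u}, \mathbf{v} \backslash v_i, v_i', w) \quad \forall \mathbf{x}, \mathbf{y} \backslash y_i, z, \mathbf{u}, \mathbf{v} \backslash v_i, w$$
$$\sum_{z} P_{\mathbf{XY}Z|\mathbf{UV}W}(\mathbf{x}, \mathbf{y}, z, \mathbf{u}, \mathbf{v}, w) = \sum_{z} P_{\mathbf{XY}Z|\mathbf{UV}W}(\mathbf{x}, \mathbf{y}, z, \mathbf{u}, \mathbf{v}, w') \quad \forall \mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v},$$

where we used the notation $\mathbf{x} \backslash x_i$ to abbreviate $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$, i.e., all $x_j$ for which $j \neq i$.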

Note that the above conditions imply the non-signaling condition between any partition of the
input/output interfaces. An explicit proof of this is given in Appendix C.1.



3 Security Definition 
3.1 Indistinguishability 

We define security in the context of random systems [35]. A system is an object taking inputs and
giving outputs, such as, for example, a box or several boxes. The different interfaces, the number of
interactions, and, if there is one, the time-wise ordering of these inputs and outputs are described in the
definition of the system.

The closeness of two systems $\mathcal{S}_0$ and $\mathcal{S}_1$ can be measured by introducing a so-called distinguisher.
A distinguisher $\mathcal{D}$ is itself a system and it has the same interfaces as the system $\mathcal{S}_0$, with the only
difference that wherever $\mathcal{S}_0$ takes an input, $\mathcal{D}$ gives an output and vice versa. In addition, $\mathcal{D}$ has
an extra output. The distinguisher $\mathcal{D}$ has access to all interfaces of $\mathcal{S}_0$, even though these interfaces
might not be in the same location when the protocol is executed (for example, one of the interfaces
might be the one seen by Alice, while the other is the one seen by Eve).

8 We will write $U$ for the random bit denoting Alice's input; bold-face letters $\mathbf{U}$ will denote an $n$-bit random variable
(i.e., an $n$-bit vector), $U_i$ a single random bit in this $n$-bit string, and lowercase letters the value that the random
variable has taken. A similar notation is used for Alice's output $X$ and Bob's input and output $V$ and $Y$. No
assumption is made about the range of Eve's input/output variables $W$ and $Z$.



Fig. 8. A system. 
Fig. 9. The distinguisher



Now consider the following game: the distinguisher $\mathcal{D}$ is given one out of two systems at random
(either $\mathcal{S}_0$ or $\mathcal{S}_1$), but the distinguisher does not know which one. It then has to interact with
the system and output a bit $B$ at the end, guessing which system it has interacted with. The distinguishing
advantage between system $\mathcal{S}_0$ and $\mathcal{S}_1$ is the maximum guessing advantage any distinguisher
can have in this game (see Figure 9).

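Definition 4. The distinguishing advantage between two systems $\mathcal{S}_0$ and $\mathcal{S}_1$ is

$$\delta(\mathcal{S}_0, \mathcal{S}_1) = \max_{\mathcal{D}} [P(B = 1 \mid \mathcal{S} = \mathcal{S}_0) - P(B = 1 \mid \mathcal{S} = \mathcal{S}_1)].$$

Two systems $\mathcal{S}_0$ and $\mathcal{S}_1$ are called $\varepsilon$-indistinguishable if $\delta(\mathcal{S}_0, \mathcal{S}_1) \leq \varepsilon$.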

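The probability of any event $\mathcal{E}$ when the distinguisher $\mathcal{D}$ is interacting with $\mathcal{S}_0$ or $\mathcal{S}_1$ cannot
differ by more than this quantity.

Lemma 4. Assume two $\varepsilon$-indistinguishable systems $\mathcal{S}_0$ and $\mathcal{S}_1$. Denote by $P(\mathcal{E} \mid \mathcal{S}_0, \mathcal{D})$ the probability
of an event $\mathcal{E}$, defined by any of the input and output variables, given the distinguisher $\mathcal{D}$ is interacting
with the system $\mathcal{S}_0$. Then

$$P(\mathcal{E} \mid \mathcal{S}_0, \mathcal{D}) \leq P(\mathcal{E} \mid \mathcal{S}_1, \mathcal{D}) + \varepsilon.$$

Proof. Assume $P(\mathcal{E} \mid \mathcal{S}_0, \mathcal{D}) > P(\mathcal{E} \mid \mathcal{S}_1, \mathcal{D}) + \varepsilon$ and define the distinguisher $\mathcal{D}$ such that it outputs
$B = 0$ whenever the event $\mathcal{E}$ has happened and whenever $\mathcal{E}$ has not happened it outputs $B = 1$.
Then this distinguisher reaches a distinguishing advantage of $\delta(\mathcal{S}_0, \mathcal{S}_1) > \varepsilon$, contradicting the
assumption that the two systems are $\varepsilon$-indistinguishable. □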
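For the simplest kind of system, one that takes no input and merely outputs a symbol, the distinguishing advantage coincides with the statistical distance of the two output distributions. The sketch below uses two hypothetical two-bit distributions (assumptions, not from the paper) to compute this advantage and to confirm the statement of Lemma 4 for every possible event.

```python
from fractions import Fraction
from itertools import combinations

def distinguishing_advantage(p0, p1):
    """Statistical distance: sum of the positive parts of p0(s) - p1(s)."""
    symbols = set(p0) | set(p1)
    zero = Fraction(0)
    return sum(max(p0.get(s, zero) - p1.get(s, zero), zero) for s in symbols)

def event_probability(dist, event):
    return sum(dist.get(s, Fraction(0)) for s in event)

# Two hypothetical systems that each output a two-bit string.
s0 = {"00": Fraction(1, 4), "01": Fraction(1, 4), "10": Fraction(1, 4), "11": Fraction(1, 4)}
s1 = {"00": Fraction(3, 8), "01": Fraction(1, 4), "10": Fraction(1, 4), "11": Fraction(1, 8)}

delta = distinguishing_advantage(s0, s1)

# Lemma 4, checked for every event, i.e., every subset of possible outputs.
symbols = sorted(s0)
for size in range(len(symbols) + 1):
    for event in combinations(symbols, size):
        assert event_probability(s0, event) <= event_probability(s1, event) + delta

print(delta)  # 1/8
```

Interactive systems with several interfaces, as considered in this paper, need a more general treatment; the example only illustrates the definition.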



3.2 Security of a Key 

The security of a cryptographic primitive can be measured by the distance of this system from an
ideal system, which is secure by definition. For example, in the case of key distribution the ideal
system is the one which outputs a uniform and random key (bit string) at one end and for which all
other input/output interfaces are completely independent of this first interface. This key is secure
by construction. If the real key distribution protocol is $\varepsilon$-indistinguishable from the ideal one, then
by Lemma 4 the key obtained from the real system needs to be secure except with probability $\varepsilon$.
This is because the probability that an adversary has knowledge about the key is 0 in the ideal case.

Fig. 10. The real and ideal system for the case of key distribution.



Definition 5. A key $S$ is $\varepsilon$-secure if the system outputting $S$ is $\varepsilon$-indistinguishable from an ideal
system which outputs a uniform random variable $S$ and for which all other input/output interfaces
are completely independent of the random variable $S$.

This definition implies that the resulting security is universally composable [39], [4], [15]. In fact,
assume by contradiction that there exists any way of using the key (or any other part of the system
which generates the key) such that the result is insecure, i.e., distinguishable with probability larger
than $\varepsilon$ from the ideal system. This process could then be used to distinguish the key generation
scheme from an ideal one with probability larger than $\varepsilon$, which is impossible by definition.



3.3 Security of Our Key Agreement Protocol 

The system we consider (see Figure 11) is the one where Alice and Bob share a public authenticated
channel plus a quantum state (modeled as a box). Eve can wire-tap the public channel and choose
an input on her part of the box and obtain an output (i.e., measure her part of the quantum
state). Similar to the quantum case, it is no advantage for Eve to make several box partitions
(measurements) instead of a single one, as the same information can be obtained by making a
refined box partition of the initial box. Without loss of generality, we can, therefore, assume that
Eve gives a single input at the end (after all communication between Alice and Bob is finished). In
our scenario, Eve, therefore, obtains all the communication exchanged over the public channel $Q$,
can then choose the input to her box $W$ (which can depend on $Q$) and finally obtains the outcome
of the box $Z$. If Alice and Bob apply a protocol $\pi$ to the inputs and outputs of their boxes and the
information exchanged over the public channel to obtain a key, this protocol can also be included
in the system. The new system now outputs the key $S_A$ on Alice's and $S_B$ on Bob's side. Obviously,
Eve's possibilities to interact with this system have not changed. We will, therefore, need to bound
the distance between this system and the ideal system.^9

Fig. 11. Our system. Alice and Bob share a public authentic channel and a quantum state. When
they apply a protocol $\pi$ to obtain a key, all this can together be modeled as a system.

9 Note that we can consider the distance of $S_A$ from an ideal key and the distance between $S_A$ and $S_B$ (probability
of the keys to be unequal) separately. By the triangle inequality, the distance of the total real system from the ideal
system is at most the sum of the two.
Fig. 12. Our system. The distribution of the random variable $S$ in the ideal case is such that
$P_S(s) = 1/|\mathcal{S}|$.



The following corollary is a direct consequence^10 of the definitions of the systems in Figure 12
and the distinguishing advantage.

Corollary 1. Assume a key $S$ generated by a system as given in Figure 12. Then

$$\delta(\mathcal{S}_{\mathrm{real}}, \mathcal{S}_{\mathrm{ideal}}) = 1/2 \cdot \sum_{s,q} \max_{w} \sum_{z} P_{Z,Q|W=w}(z, q) \cdot |P_{S|Z=z,Q=q,W=w}(s) - P_U|,$$

where $w$ is chosen such as to maximize this quantity and $P_U := 1/|\mathcal{S}|$.

This quantity will be the one that is relevant for our security definition and because it corresponds
to the distance from uniform of the key from the eavesdropper's point of view, we will in the
following call it the distance from uniform of $S$ given $Z(W)$ and $Q$, where we write $Z(W)$ because the
eavesdropper can choose the input adaptively and the choice of input changes the output distribution.

Definition 6. The distance from uniform of $S$ given $Z(W)$ and $Q$ is

$$d(S|Z(W), Q) = 1/2 \cdot \sum_{s,q} \max_{w} \sum_{z} P_{Z,Q|W=w}(z, q) \cdot |P_{S|Z=z,Q=q,W=w}(s) - P_U|.$$

4 Secrecy from a Single Box 

Let us take a closer look at the simple case where the protocol $\pi$ directly takes the output of an
imperfect PR box as a key. More explicitly, Alice and Bob share an imperfect PR box, one that
fulfills $P(X \oplus Y = U \cdot V) = 1 - \varepsilon$ for uniform inputs. Alice and Bob use the box giving a random
input and obtain an output. Then they announce their inputs over the public authentic channel,
i.e., $Q := (U = u, V = v)$.^11 We will show in this section that Eve can get some knowledge about
Alice's outcome $X$ depending on $\varepsilon$, but the distance from uniform from her point of view is limited
by $2\varepsilon$ (assuming she gets to know the input).

Lemma 5. Assume a tripartite box $P_{XYZ|UVW}$ such that the marginal $P_{XY|UV}$ is a non-local box
with $1/4 \cdot \sum_{x \oplus y = u \cdot v} P_{XY|UV}(x, y, u, v) = 1 - \varepsilon$ and $Q := (U = u, V = v)$. Then

$$d(X|Z(W), Q) \leq 2\varepsilon.$$

Proof. Consider w.l.o.g. the case $X = 0$. First we generalize the table from Section 1.4 to the
case where $P(X \oplus Y \neq U \cdot V) = \varepsilon$ on average (and it is not necessarily $\varepsilon$ for every single input).
We call $\varepsilon_i$ the probability not to fulfill the CHSH condition ($X \oplus Y \neq U \cdot V$) for the inputs
$\{(0, 0), (0, 1), (1, 0), (1, 1)\}$ respectively. Suppose w.l.o.g. that the input was $(0, 0)$, so $X$ should be
maximally biased for this input.

Because $P[X \oplus Y \neq U \cdot V \mid U, V = 0, 0] = \varepsilon_1$, the bias of $Y$, given $U = V = 0$, must be at
least $p - \varepsilon_1$. Because of non-signaling, $X$'s bias must be $p$ as well when $V = 1$, and so on. Finally,
$P[X \oplus Y \neq U \cdot V \mid U, V = 1, 1] = \varepsilon_4$ implies $p - \varepsilon_2 - (1 - (p - \varepsilon_1 - \varepsilon_3)) \leq \varepsilon_4$, hence,
$p \leq 1/2 + 1/2 \sum_i \varepsilon_i = 1/2 + 2\varepsilon$. Now consider a box partition of $P_{XY|UV}$ parametrized by $z$. Let
$\varepsilon_z$ denote the $\varepsilon$ of the box given $Z = z$, i.e., $\varepsilon_z = 1/4 \cdot \sum_i \varepsilon_{i,z}$. Because this box must still be non-signaling,
the bias of $X$ given $Z = z$, $U = u$ and $V = v$ is at most $2\varepsilon_z$ by the above argument. However,
because $P_{XY|UV} = \sum_z p^z \cdot P^z_{XY|UV}$ we also have $\varepsilon = \sum_z p^z \cdot \varepsilon_z$, and because this further holds for
all values of $X$, $d(X|Z(W), Q) \leq \sum_z p^z \cdot 2\varepsilon_z = 2\varepsilon$. □

10 For the formal proof it is useful to note that instead of a box taking input $W$, we can consider a box giving outputs
indexed by $w$, $Z_w$, of which one is selected. This reflects that the box considered is non-signaling.
11 We will, in a certain abuse of notation, allow $Q$ to consist of both random variables and events that a random
variable takes a given value. In case of such events $\{U = u\}$, this means that the distance from uniform will hold
given this specific value $u$, whereas taking the expectation over $Q$ will correspond to taking the expectation over all
the "free" random variables contained in $Q$.

Remark 1. Note that there exists a box partition which reaches this bound, and that it can be found through a straightforward maximization. The explicit calculations are given in Appendix A.

Boxes $P_{XY|UV}$ that approximate a PR box with error $\varepsilon \in [0, 0.25)$ are non-local. We see that for any non-local box, Eve cannot obtain perfect knowledge about Alice's output bit, and the box, therefore, contains some secrecy.
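The quantities entering Lemma 5 can be made concrete with a small sketch. It assumes an isotropic noisy PR box (the same error for every input, with failures spread evenly over the two failing outcomes); this particular box and the function names are illustrative assumptions, since the lemma itself only needs the average error $\varepsilon$.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def noisy_pr_box(eps):
    """Assumed isotropic box: P(xy|uv) = (1-eps)/2 if x XOR y == u AND v, else eps/2."""
    return {(x, y, u, v): (1 - eps) / 2 if (x ^ y) == (u & v) else eps / 2
            for x, y, u, v in product(BITS, repeat=4)}

def chsh_error(box):
    """Average probability, over uniform inputs, that x XOR y != u AND v."""
    return sum(p for (x, y, u, v), p in box.items() if (x ^ y) != (u & v)) / 4

def is_non_signaling(box):
    """Alice's marginal must not depend on v, and Bob's must not depend on u."""
    alice = all(sum(box[x, y, u, 0] for y in BITS) == sum(box[x, y, u, 1] for y in BITS)
                for x, u in product(BITS, repeat=2))
    bob = all(sum(box[x, y, 0, v] for x in BITS) == sum(box[x, y, 1, v] for x in BITS)
              for y, v in product(BITS, repeat=2))
    return alice and bob

eps = Fraction(1, 10)
box = noisy_pr_box(eps)
bound = 2 * chsh_error(box)   # Lemma 5: d(X|Z(W),Q) <= 2*eps
```

Exact rational arithmetic is used so that the non-signaling equalities can be checked without rounding issues.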

5 Privacy Amplification 

In the following, we consider the case where Alice and Bob share $n$ imperfect PR boxes and the key is obtained by taking the XOR of all $n$ output bits. We will show in this section that taking the XOR of the outputs of several boxes is a good privacy amplification function. At first, we will assume that the boxes as seen by Alice and Bob are $n$ independent and unbiased (i.e., for all inputs, the outputs $X$ and $Y$ are equally likely to be 0 or 1) boxes each with an associated error $\varepsilon_i$ (which is the same for all inputs). Then, we will show that this also holds for the case when Alice and Bob share a convex combination of independent unbiased boxes. Indeed, Alice and Bob can apply a local mapping to their inputs and outputs to obtain a marginal box that is the convex combination of several independent and unbiased boxes and, therefore, enforce this situation [30, 31]. The details of this local mapping, called depolarization, are described in Appendix E. Finally, we completely remove the criterion of independence and show that the distance from uniform of the XOR of the outcomes of any $(2n+1)$-non-signaling system having binary inputs and outputs cannot be larger than what could be obtained from its "depolarized" version.
The main result of this section will be the following lemma.

Lemma 6. Assume a $(2n+1)$-partite box $P_{XYZ|UVW}$ such that the marginal $P_{XY|UV}$ corresponds to $n$ independent and unbiased non-local boxes each with an associated error $\varepsilon_i$. Assume $f(X) := \bigoplus_i X_i$ and $Q := (U = u, V = v, F = \oplus)$. Then

$$d(f(X)|Z(W),Q) \le \frac{1}{2} \cdot \prod_i (4\varepsilon_i) \quad \left( \le \frac{1}{2} \cdot (4\varepsilon)^n \right) , \qquad (3)$$

where $\varepsilon = \frac{1}{n} \sum_i \varepsilon_i$.

Note that this value can easily be reached by attacking each box independently, such as given in Appendix A, and this bound is, therefore, tight.
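To get a feeling for the two expressions in (3), the following sketch evaluates both for an arbitrary list of errors. The error values are made up for illustration, and the function names are hypothetical. The second bound is the weaker one, which follows from the inequality between the geometric and the arithmetic mean.

```python
from math import prod

def xor_bound(errors):
    """First bound of (3): 1/2 * prod_i (4 * eps_i)."""
    return 0.5 * prod(4 * e for e in errors)

def xor_bound_mean(errors):
    """Second (weaker) bound of (3): 1/2 * (4 * eps)^n with eps the mean error."""
    n = len(errors)
    return 0.5 * (4 * sum(errors) / n) ** n

errors = [0.05, 0.10, 0.15, 0.20]      # illustrative values, all below 1/4
tight, loose = xor_bound(errors), xor_bound_mean(errors)
```

Both bounds decrease exponentially in $n$ as long as the errors stay below 1/4, which is the sense in which the XOR amplifies privacy.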

For the proof of Lemma 6 we will proceed in several steps. First, we show that the problem of finding the maximum distance from uniform of the XOR of several output bits can be cast as a linear optimization problem. Then, we show that this linear program describing $n$ boxes can be seen as the $n$-wise tensor product of the linear program describing a single box; this is the crucial step. By using the product form of the linear program we can then show that there exists a dual feasible solution, i.e., an upper bound on the distance from uniform, reaching the above value.

First we note that the maximal possible non-uniformity of the XOR of the output bits can be obtained by a box partition with only two outputs, 0 and 1.

Lemma 7. Assume there exists a box partition with distance from uniform $d(\bigoplus_i X_i | Z'(W), Q)$. Then there exists a box partition with the same distance from uniform with $Z \in \{0, 1\}$.

Proof. Assume that the box partition has more than two elements. Define two new elements $(p^{z=0}, P^{z=0}_{XY|UV})$ by

$$p^{z=0} := \sum_{i=1}^{m} p^{z'_i} , \qquad P^{z=0}_{XY|UV} := \frac{1}{p^{z=0}} \sum_{i=1}^{m} p^{z'_i} \cdot P^{z'_i}_{XY|UV} ,$$

where the set $z'_1, \ldots, z'_m$ is defined to consist of the boxes such that $P[\bigoplus_i X_i = 0 | Z = z'_i] > 1/2$. Similarly define $(p^{z=1}, P^{z=1}_{XY|UV})$ as the convex combination of the remaining elements of the box partition. Because the space of boxes is convex, this forms again a valid box partition and it has the same distance. □



It is, therefore, sufficient to consider a box partition with only two elements $z = 0$ and $z = 1$. However, given one element of the box partition $(p, P^{z=0}_{XY|UV})$, the second element $(1 - p, P^{z=1}_{XY|UV})$ is determined, because their convex combination forms the marginal box, $P_{XY|UV}$.

Lemma 8. Assume a box partition $w$ with element $(p, P^{z=0}_{XY|UV})$ and an unbiased bit $S = f(X)$ such that w.l.o.g. $P[S = 0 | Z = 0, Q] > 1/2$. Then the distance from uniform of $S$ given the box partition $w$ and $Q = (U = u, V = v, F = f)$ is

$$d(S|Z(w), Q) = 2 \cdot p \cdot (P[S = 0 | Z = 0, Q] - 1/2) .$$

Proof.

$$d(S|Z(w), Q) = p \cdot (P[S = 0 | Z = 0, Q] - 1/2) + (1 - p) \cdot (-1) \cdot \left( \frac{1/2 - p \cdot P[S = 0 | Z = 0, Q]}{1 - p} - 1/2 \right) = 2 \cdot p \cdot (P[S = 0 | Z = 0, Q] - 1/2) .$$

□
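The identity of Lemma 8 can be checked numerically. The sketch below uses exact fractions and arbitrary illustrative values for $p$ and $P[S = 0 | Z = 0, Q]$; the function name is hypothetical. It evaluates the two-outcome distance directly and compares it with the closed form.

```python
from fractions import Fraction

def distance_two_outcomes(p, p0):
    """d = sum_z p_z * |P[S=0|Z=z] - 1/2| for Z in {0, 1}, with S unbiased overall."""
    half = Fraction(1, 2)
    # P[S=0|Z=1] is fixed by unbiasedness: p*p0 + (1-p)*p1 = 1/2
    p1 = (half - p * p0) / (1 - p)
    return p * abs(p0 - half) + (1 - p) * abs(p1 - half)

p, p0 = Fraction(2, 5), Fraction(3, 4)        # illustrative values with 0 < p < 1
direct = distance_two_outcomes(p, p0)
closed_form = 2 * p * (p0 - Fraction(1, 2))   # Lemma 8
```

The two contributions are equal because the bias of the second element is fixed by the first one and the unbiasedness of $S$.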



The above lemmas imply that finding the distance from uniform is equivalent to finding the "best" element of a box partition $(p, P^{z=0}_{XY|UV})$. When can $(p, P^{z=0}_{XY|UV})$ be element of a box partition? The criterion is given in Lemma 9.

Lemma 9. Given a box $P_{XY|UV}$, there exists a box partition with element $(p, P^{z=0}_{XY|UV})$ if and only if for all inputs and outputs $x, y, u, v$,

$$p \cdot P^{z=0}_{XY|UV}(xy|uv) \le P_{XY|UV}(xy|uv) . \qquad (4)$$

Proof. The non-signaling condition is linear and the space of conditional probability distributions is convex, therefore a convex combination of valid boxes is again a valid box. To prove that the outcome $z = 0$ can occur with probability $p$ it is, therefore, sufficient to show that there exists another valid outcome $z = 1$ which can occur with probability $1 - p$, and that the weighted sum of the two is $P_{XY|UV}$. If $P^{z=0}_{XY|UV}$ is a normalized and non-signaling probability distribution, then so is $P^{z=1}_{XY|UV}$, because the sum of the two, $P_{XY|UV}$, is also non-signaling and normalized. Therefore, we only need to verify that all entries of the complementary box $P^{z=1}_{XY|UV}$ are between 0 and 1. However, this box is the difference

$$P^{z=1}_{XY|UV} = \frac{1}{1 - p} \cdot \left( P_{XY|UV} - p \cdot P^{z=0}_{XY|UV} \right) .$$

Requesting this to be greater or equal to 0 is equivalent to (4). We observe that all entries of $P^{z=1}_{XY|UV}$ are now trivially smaller than or equal to 1 because of the normalization: if the sum of positive summands is 1, each of them can be at most 1. □



We can now show that the maximal distance from uniform which can be reached by a non-signaling adversary is the solution of a linear programming problem (see Appendix B for details on linear programming). We introduce a new variable $\Delta$, which is a vector such that with each value of $x, y, u, v$, we can associate an entry of $\Delta$, and we write $\Delta(x, y|u, v)$ for this entry. $\Delta$ can be seen as a probability distribution describing a box, where the distribution need not be normalized nor positive.

Lemma 10. The distance from uniform of $\bigoplus_i X_i$ given $Z(W)$ and $Q := (U = u, V = v, F = \oplus)$ is

$$d(\textstyle\bigoplus_i X_i | Z(W), Q) = \frac{1}{2} \cdot b^T \cdot \Delta^* ,$$

where $b^T \cdot \Delta^*$ is the optimal value of the linear program

$$\begin{aligned}
\max: \quad & \sum_{(x,y):\, f(x)=0} \Delta(xy|uv) - \sum_{(x,y):\, f(x)=1} \Delta(xy|uv) & (5) \\
\text{s.t.:} \quad & \sum_{x} \Delta(xy|uv) - \sum_{x} \Delta(xy|u'v) = 0 \quad \forall y, v, u, u' & \text{(non-signaling from Alice to Bob)} \\
& \sum_{y} \Delta(xy|uv) - \sum_{y} \Delta(xy|uv') = 0 \quad \forall x, u, v, v' & \text{(non-signaling from Bob to Alice)} \\
& \Delta(xy|uv) \le P(xy|uv) \quad \forall x, y, u, v \\
& \Delta(xy|uv) \ge -P(xy|uv) \quad \forall x, y, u, v
\end{aligned}$$

In the following we drop the indices of the probability distributions as they should be clear from the context.

Proof. We show that every element of a box partition $(p, P^{z=0}_{XY|UV})$ corresponds to a feasible $\Delta$ and vice versa.

Assume an element of a box partition $(p, P^{z=0}_{XY|UV})$ and define

$$\Delta(xy|uv) = 2p \cdot P^{z=0}(xy|uv) - P(xy|uv) .$$

$\Delta$ fulfills the non-signaling conditions by linearity. Further, $p \ge 0$ and $P^{z=0}(xy|uv) \ge 0$ imply $\Delta(xy|uv) \ge -P(xy|uv)$, and $p \cdot P^{z=0}(xy|uv) \le P(xy|uv)$ implies $\Delta(xy|uv) \le P(xy|uv)$. $\Delta$ is, therefore, feasible.

To see the reverse direction, assume a feasible $\Delta$. Define

$$p = \frac{1}{2} \cdot \Big( 1 + \sum_{xy} \Delta(xy|0 \ldots 00 \ldots 0) \Big) , \qquad P^{z=0}(xy|uv) = \frac{P(xy|uv) + \Delta(xy|uv)}{2p} .$$

(For completeness, define $P^{z=0}(xy|uv) = P(xy|uv)$ in case $p = 0$.) To see that $(p, P^{z=0}_{XY|UV})$ is element of a box partition, note that $\sum_{xy} \Delta(xy|0 \ldots 00 \ldots 0) = \sum_{xy} \Delta(xy|u'v')$ for all $u', v'$ because of the non-signaling constraints. I.e., $p$ is independent of the chosen input and the transformation is, therefore, linear. This implies that $P^{z=0}$ is still non-signaling. Because

$$\sum_{xy} P^{z=0}(xy|uv) = \sum_{xy} \frac{P(xy|uv) + \Delta(xy|uv)}{2p} = \frac{1 + (2p - 1)}{2p} = 1$$

it is normalized. Because $-P(xy|uv) \le \Delta(xy|uv) \le P(xy|uv)$ and $\sum_{xy} P(xy|uv) = 1$, we have $-1 \le \sum_{xy} \Delta(xy|0 \ldots 00 \ldots 0) \le 1$ and this implies $P^{z=0}(xy|uv) \ge 0$, i.e., $P^{z=0}_{XY|UV}$ is a box. By Lemma 9, $(p, P^{z=0}_{XY|UV})$ is element of a box partition because

$$p \cdot P^{z=0}(xy|uv) = \frac{1}{2} \cdot \Big( 1 + \sum_{xy} \Delta(xy|0 \ldots 00 \ldots 0) \Big) \cdot \frac{P(xy|uv) + \Delta(xy|uv)}{1 + \sum_{xy} \Delta(xy|0 \ldots 00 \ldots 0)} = \frac{1}{2} \cdot (P(xy|uv) + \Delta(xy|uv)) \le P(xy|uv) .$$

Finally, we show that the value of the objective function for any $\Delta$ is exactly twice the distance from uniform reached by the box partition with element $(p, P^{z=0}_{XY|UV})$:

$$\begin{aligned}
& \sum_{(x,y):\, f(x)=0} \Delta(xy|uv) - \sum_{(x,y):\, f(x)=1} \Delta(xy|uv) \\
&= \sum_{(x,y):\, f(x)=0} \left( 2p \cdot P^{z=0}(xy|uv) - P(xy|uv) \right) - \sum_{(x,y):\, f(x)=1} \left( 2p \cdot P^{z=0}(xy|uv) - P(xy|uv) \right) \\
&= 2p \cdot \Big( \sum_{(x,y):\, f(x)=0} P^{z=0}(xy|uv) - \sum_{(x,y):\, f(x)=1} P^{z=0}(xy|uv) \Big) \\
&= 2 \cdot 2p \cdot (P[f(X) = 0 | Z = 0, Q] - 1/2) ,
\end{aligned}$$

which is exactly twice the distance from uniform by Lemma 8. □
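As a sanity check of the linear program (5) for a single box, the sketch below builds one candidate $\Delta$ and verifies that it is feasible and reaches the objective value $4\varepsilon$, i.e., distance $2\varepsilon$. The construction is an assumption made for illustration: the marginal is taken to be an isotropic noisy PR box, and $\Delta$ is built from the eight local deterministic strategies that satisfy the CHSH condition on three of the four inputs, signed by Alice's output on input $u = 0$. The paper's own explicit partition is the one in Appendix A. Verifying a feasible point only gives a lower bound on the optimum; the matching upper bound is the duality argument that follows.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
KEYS = list(product(BITS, repeat=4))            # entries indexed by (x, y, u, v)

def noisy_pr_box(eps):
    """Assumed isotropic marginal box P(xy|uv)."""
    return {(x, y, u, v): (1 - eps) / 2 if (x ^ y) == (u & v) else eps / 2
            for x, y, u, v in KEYS}

def individual_attack_delta(eps):
    """Candidate Delta = 2p*P^{z=0} - P for an attack revealing X on input u = 0."""
    delta = {k: Fraction(0) for k in KEYS}
    for a0, a1, b0, b1 in product(BITS, repeat=4):
        # local deterministic strategy: x = a0 XOR a1*u, y = b0 XOR b1*v
        outs = {(u, v): (a0 ^ (a1 & u), b0 ^ (b1 & v))
                for u, v in product(BITS, repeat=2)}
        wins = sum((x ^ y) == (u & v) for (u, v), (x, y) in outs.items())
        if wins != 3:
            continue                             # keep the 8 strategies winning 3 of 4
        sign = 1 if a0 == 0 else -1              # z = 0 collects strategies with X = 0
        for (u, v), (x, y) in outs.items():
            delta[x, y, u, v] += sign * eps / 2
    return delta

def is_feasible(delta, box):
    """Check the constraints of the linear program (5)."""
    a_to_b = all(sum(delta[x, y, 0, v] for x in BITS) == sum(delta[x, y, 1, v] for x in BITS)
                 for y, v in product(BITS, repeat=2))
    b_to_a = all(sum(delta[x, y, u, 0] for y in BITS) == sum(delta[x, y, u, 1] for y in BITS)
                 for x, u in product(BITS, repeat=2))
    bounded = all(-box[k] <= delta[k] <= box[k] for k in KEYS)
    return a_to_b and b_to_a and bounded

def objective(delta, u=0, v=0):
    """Objective of (5) for f(x) = x and the announced input (u, v)."""
    return sum((1 if x == 0 else -1) * delta[x, y, u, v]
               for x, y in product(BITS, repeat=2))

eps = Fraction(1, 10)
delta = individual_attack_delta(eps)
value = objective(delta)                         # 4*eps, so the distance is value/2 = 2*eps
```

For $\varepsilon > 1/4$ the same construction violates the bound $\Delta \le P$, which matches the fact that such a box is no longer non-local.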



We know that there exists a feasible $\Delta$ which reaches a value of $\prod_i (4\varepsilon_i)$, namely the $\Delta$ associated with the box partition corresponding to an individual attack. We now want to show that this value is also dual feasible and, therefore, optimal. First, we rewrite the primal in a form with only inequality constraints and no equality constraints. To do so, we replace constraints of the form $a_j \cdot \Delta = 0$ by the two constraints $a_j \cdot \Delta \le 0$ and $-a_j \cdot \Delta \le 0$. We obtain:

$$\begin{aligned} \max: \quad & b^T \cdot \Delta \\ \text{s.t.:} \quad & A \cdot \Delta \le c \end{aligned} \qquad \text{and its dual} \qquad \begin{aligned} \min: \quad & c^T \lambda \\ \text{s.t.:} \quad & A^T \cdot \lambda = b \\ & \lambda \ge 0 \end{aligned} \qquad (6)$$

The explicit values of $A$, $b$, $c$ and the dual optimal solution $\lambda^*$ for the case of a single box are given in Appendix C. Note that in the dual program, the marginal box as seen by Alice and Bob only appears in the objective function. The feasible region is, therefore, completely independent of the marginal.

Our main tool to show optimality will be to show that we can express the linear program describing $n$ boxes as the tensor product of the linear program describing one box.

Lemma 11. Assume $A_1, b_1, c_1$ are the vectors and matrices associated with the linear program (6) for the case of a single box. Then the value of the program $A_n, b_n, c_n$ associated with $n$ boxes is equal to the value of the linear program defined by

$$\begin{aligned} \max: \quad & (b_1^{\otimes n})^T \cdot \Delta & (7) \\ \text{s.t.:} \quad & A_1^{\otimes n} \cdot \Delta \le c_1^{\otimes n} . \end{aligned}$$

We write here $c_1$ for each of the $n$ boxes for notational simplicity. However, the marginal box $c_i$ could actually be different for each of the $n$ boxes without having to change our argument.



Proof. We describe the case $n = 2$; the case of larger $n$ is analogous. First note that with each entry of $\Delta$ for a single box there are associated input and output bits $X_i, Y_i, U_i, V_i$. With each entry of $\Delta$ living in the tensor product space of two boxes, we can associate an entry $X, Y, U, V$ corresponding to two bits each in the obvious way.

$b_1$ is such that the entry associated with $X_1 = 0, U_1 = 0, V_1 = 0$ is 1, the entry associated with $X_1 = 1, U_1 = 0, V_1 = 0$ is $-1$, and for all other inputs it is zero (the choice of input 0, 0 is arbitrary and no restriction). $b_1 \otimes b_1$ is, therefore, such that for $\bigoplus_i X_i = 0, U = 00, V = 00$ it is 1; for $\bigoplus_i X_i = 1, U = 00, V = 00$ it is $-1$; and for all other inputs it is 0. This is exactly the form that gives us the bias of the XOR of two output bits given input $U, V = 00, 00$.

Now let us see that $A$ and $c$ can also be taken of tensor product form. Indeed, we will show that the constraints given by $A_1^{\otimes n}$ and $c_1^{\otimes n}$ are either exactly the ones that describe a $2n$-party non-signaling box or they are trivially fulfilled and, therefore, do not modify the value of the linear program. We can divide the lines of $A$ into 4 types; we call them $A^{\text{n-s}}$ (for "non-signaling"), $-A^{\text{n-s}}$ (which contains the same coefficients as $A^{\text{n-s}}$ but with the sign reversed), $I_{16 \times 16}$ and $-I_{16 \times 16}$ (which contains a 1 resp. $-1$ at a certain position and 0 everywhere else) (compare with Appendix C). The entries of $c$ associated with these types are respectively 0, 0, $P(xy|uv)$ and $P(xy|uv)$ (the marginal probabilities). Now consider $A_1^{\otimes 2}$ and $c_1^{\otimes 2}$. We now have 16 types of rows, corresponding to all possible combinations.

1. Type $I_{16 \times 16} \otimes I_{16 \times 16} = I_{256 \times 256}$. The associated $c$ is $P(x_1 y_1|u_1 v_1) \cdot P(x_2 y_2|u_2 v_2)$ (i.e., the probability entry of the two boxes) and these constraints correspond exactly to the upper bound on $\Delta$ in the case of two boxes. (Type $-I_{16 \times 16} \otimes -I_{16 \times 16} = I_{256 \times 256}$ is exactly the same row and, therefore, follows from this one.)

2. Type $-I_{16 \times 16} \otimes I_{16 \times 16} = -I_{256 \times 256}$. The associated $c$ is $P(x_1 y_1|u_1 v_1) \cdot P(x_2 y_2|u_2 v_2)$ and these constraints correspond exactly to the lower bound on $\Delta$ in the case of two boxes. (Type $I_{16 \times 16} \otimes -I_{16 \times 16} = -I_{256 \times 256}$ is exactly the same row and, therefore, follows from this one.)

3. The lines of the form $A^{\text{n-s}} \otimes I_{16 \times 16}$ correspond exactly to the non-signaling constraints for two boxes. To see this, assume that the non-signaling constraint on the first box is of the form $\sum_{x_1} P(x_1, y_1|u_1, v_1) - \sum_{x_1} P(x_1, y_1|u'_1, v_1)$ and the identity on the second box is 1 at the position $x_2, y_2, u_2, v_2$ and 0 everywhere else. Then the constraint $A^{\text{n-s}} \otimes I_{16 \times 16}$ corresponds to

$$\sum_{x_1} P(x_1, x_2, y_1, y_2|u_1, u_2, v_1, v_2) - \sum_{x_1} P(x_1, x_2, y_1, y_2|u'_1, u_2, v_1, v_2)$$

and this is exactly the form of a $2n$-party non-signaling constraint. The associated entry of $c$ is, as expected, $0 \cdot P(x_2, y_2|u_2, v_2) = 0$. Together with the constraints of the form $I_{16 \times 16} \otimes A^{\text{n-s}}$ we obtain all the non-signaling constraints for the two boxes. (Type $-A^{\text{n-s}} \otimes -I_{16 \times 16}$ is again exactly the same row.)

4. The lines of the form $-A^{\text{n-s}} \otimes I_{16 \times 16}$ and $I_{16 \times 16} \otimes -A^{\text{n-s}}$ give the same non-signaling constraints as above but with reversed sign, therefore enforcing the equality constraint by two inequality constraints. ($A^{\text{n-s}} \otimes -I_{16 \times 16}$ and $-I_{16 \times 16} \otimes A^{\text{n-s}}$ are again exactly the same rows and are, therefore, trivially fulfilled.)

5. There remain the lines of the form $A^{\text{n-s}} \otimes A^{\text{n-s}}$. Their associated $c$ is $0 \cdot 0 = 0$. However, the second non-signaling constraint can be seen as a linear combination of the identity constraints, i.e., each row of the second factor $A^{\text{n-s}}$ is a linear combination of rows of $I_{16 \times 16}$. Because of the linearity of the tensor product in the second component, this constraint is, therefore, a linear combination of the non-signaling constraints given in points 3 and 4 above, and because each of them is equal to 0, their linear combination is also equal to 0 and this constraint is, therefore, trivially fulfilled whenever the above constraints are. The same argument holds for the rows $-A^{\text{n-s}} \otimes A^{\text{n-s}}$, $A^{\text{n-s}} \otimes -A^{\text{n-s}}$ and $-A^{\text{n-s}} \otimes -A^{\text{n-s}}$.

□

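Now we consider the dual program of (7). Using Lemma 11 we see that if $\lambda_1$ is a feasible dual solution for a single box, then $\lambda_1^{\otimes n}$ is feasible for $n$ boxes.

Lemma 12. For any $\lambda_i$ which is dual feasible for the linear program $A_1, b_1$ associated with one box, $\bigotimes_i \lambda_i$ is dual feasible for the linear program (7) associated with $n$ boxes. Further, this dual feasible solution has value $c_n^T \lambda_n = \prod_i (c_i^T \lambda_i)$.

Proof. $\lambda_i$ is dual feasible for $A_1, b_1$, i.e., $A_1^T \lambda_i = b_1$ and $\lambda_i \ge 0$. Then

$$A_n^T \lambda_n = (A_1^{\otimes n})^T \Big( \bigotimes_i \lambda_i \Big) = \Big( \bigotimes_i A_1^T \Big) \Big( \bigotimes_i \lambda_i \Big) = \bigotimes_i (A_1^T \lambda_i) = b_1^{\otimes n}$$

and $\bigotimes_i \lambda_i \ge 0$, i.e., $\lambda_n = \bigotimes_i \lambda_i$ is dual feasible. Its value is $c_n^T \lambda_n = \big( \bigotimes_i c_i \big)^T \cdot \big( \bigotimes_i \lambda_i \big) = \bigotimes_i (c_i^T \lambda_i) = \prod_i (c_i^T \lambda_i)$. □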
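The proof of Lemma 12 rests on the mixed-product property of the tensor (Kronecker) product. The following sketch checks it on small toy data; the matrix and vectors are arbitrary placeholders, not the $A_1$, $c_1$ of Appendix C, and the helper names are hypothetical.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors."""
    return [x * y for x in a for y in b]

def kron_mat(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def mat_t_vec(A, lam):
    """Compute A^T * lam."""
    return [sum(A[i][j] * lam[i] for i in range(len(A))) for j in range(len(A[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

A1 = [[1, 0], [1, 1], [0, 2]]      # toy 3x2 constraint matrix
c1 = [2, 1, 3]                     # toy right-hand side
lam, mu = [1, 2, 0], [0, 1, 1]     # toy nonnegative dual vectors

# (A1 (x) A1)^T (lam (x) mu) equals (A1^T lam) (x) (A1^T mu)
lhs = mat_t_vec(kron_mat(A1, A1), kron_vec(lam, mu))
rhs = kron_vec(mat_t_vec(A1, lam), mat_t_vec(A1, mu))

# (c1 (x) c1)^T (lam (x) mu) equals (c1^T lam) * (c1^T mu)
value_product = dot(kron_vec(c1, c1), kron_vec(lam, mu))
```

The first identity is what turns single-box dual feasibility into $n$-box dual feasibility, and the second is why the dual value multiplies.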

Now we are ready to give the proof of Lemma 6.

Proof (of Lemma 6). For a single box $d(X|Z(W), Q) \le 1/2 \cdot (4\varepsilon_i)$ by Lemma 5 (see Section 4); this implies that there exists a dual feasible $\lambda_i$ such that $c_i^T \lambda_i \le 4\varepsilon_i$ for each $i$. By Lemma 12 there exists a dual feasible $\lambda_n$ such that $c_n^T \lambda_n \le \prod_i (4\varepsilon_i) \le (4\varepsilon)^n$ and, therefore, by Lemma 10,

$$d(\textstyle\bigoplus_i X_i | Z(W), Q) = \frac{1}{2} \cdot c_n^T \lambda^* \le \frac{1}{2} \cdot c_n^T \lambda_n \le \frac{1}{2} \cdot \prod_i (4\varepsilon_i) \le \frac{1}{2} \cdot (4\varepsilon)^n .$$

□

This implies that if Alice and Bob create a single key bit by applying the XOR to their outputs, there is no advantage for Eve to do a collective or coherent attack, as the above distance from uniform can be reached by an individual attack.

We now want to remove the condition that the marginal boxes of Alice and Bob need to be independent. First we consider the case when Alice and Bob share the convex combination of $n$ independent and unbiased boxes of different errors. The reason to consider this case is that no matter what boxes Alice and Bob share (they can be arbitrarily correlated), Alice and Bob can apply a random mapping to their input and output bits (see Appendix E), such that the distribution they share after this mapping in fact is the one of a convex combination of several independent and unbiased boxes with different errors [30, 31]. The statement of Lemma 6 still holds here:

Lemma 13. Assume a $(2n+1)$-partite box $P_{XYZ|UVW}$ such that the marginal $P_{XY|UV}$ corresponds to a convex combination with weight $p_j$ of $n$ unbiased non-local boxes each with an associated error $\varepsilon_i^j$. Assume $f(X) := \bigoplus_i X_i$ and $Q := (U = u, V = v, F = \oplus)$. Then $d(f(X)|Z(W), Q) \le \sum_j p_j \cdot \frac{1}{2} \cdot \prod_i (4\varepsilon_i^j)$.

Proof. Note that for a single box the dual optimal solution is $\lambda_1^*$ for all $c_1$ describing a single box (i.e., $c_1^T \cdot \lambda_1^* = 4\varepsilon$ for all $c_1$) (see Appendix C). For $n$ boxes, $(\lambda_1^*)^{\otimes n}$ is still dual feasible. It reaches a value of

$$c_n^T (\lambda_1^*)^{\otimes n} = \Big( \sum_j p_j \bigotimes_i c_i^j \Big)^T \cdot (\lambda_1^*)^{\otimes n} = \sum_j p_j \prod_i (4\varepsilon_i^j) .$$

□



Now we want to remove any requirement of independence. Lemma 14 states that choosing boxes which are not independent cannot be an advantage for Eve and the above bounds still hold.

Lemma 14. Assume a $(2n+1)$-partite box $P_{XYZ|UVW}$ with any marginal $P_{XY|UV}$. Assume $f(X) := \bigoplus_i X_i$ and $Q := (U = u, V = v, F = \oplus)$ and the distance from uniform $d(f(X)|Z(W), Q)$. Now assume a second $(2n+1)$-partite box with marginal $P'_{XY|UV}$ obtained from $P_{XY|UV}$ by depolarization and with distance from uniform $d'(f(X)|Z(W), Q)$ (with the same $Q$ and $f$). Then $d(f(X)|Z(W), Q) \le d'(f(X)|Z(W), Q)$.

Proof. We know that for $P'_{XY|UV}$, $d'(f(X)|Z(W), Q) = \sum_j p_j \cdot 1/2 \cdot \prod_i (4\varepsilon_i^j)$ by Lemma 13 and because this bound can easily be reached by attacking each box separately. However, this value is exactly the sum of all probabilities where none of the CHSH conditions are fulfilled (i.e., where $X_i \oplus Y_i \ne U_i \cdot V_i$ for all $i$).

Now consider $P_{XY|UV}$. $P'_{XY|UV}$ can be seen as the convex combination of all the $P_{XY|UV}$ to which one of the mappings given in Appendix E has been applied. However, the distance from uniform for $P_{XY|UV}$ (or their mappings) is limited by the sum of all probabilities where none of the CHSH conditions are fulfilled, and this holds for all values of the input $u, v$ (in Appendix C, for each input $u, v$ a dual feasible solution reaching this value is given). By comparison with Appendix E we see that the mappings (for each input) leave $\bigoplus_i x_i$ unchanged (up to a relabeling between 0 and 1). The mappings also leave the sum of probabilities where none of the CHSH conditions are fulfilled unchanged, because $x'_i, y'_i, u'_i, v'_i$ not fulfilling the CHSH condition are mapped to $x_i, y_i, u_i, v_i$ not fulfilling the CHSH condition. We conclude

$$d(f(X)|Z(W), Q) \le (\lambda_1^*)^{\otimes n} \cdot c = \sum_{x,y,u,v:\; x_i \oplus y_i \ne u_i \cdot v_i \;\forall i} P_{XY|UV}(xy|uv) = \sum_{x,y,u,v:\; x_i \oplus y_i \ne u_i \cdot v_i \;\forall i} P'_{XY|UV}(xy|uv) = d'(f(X)|Z(W), Q) .$$

□

Note that individual attacks are optimal only in the specific case considered above, where a single key bit is formed as the XOR of the outputs; in general, they are strictly weaker than collective or coherent attacks. We give an example of such a collective attack in Appendix F.

6 Full Key Agreement 

6.1 Privacy Amplification: From One to Several Bits 

We have seen in the previous section that it is possible to create a highly secure bit using a linear function, the XOR. But obviously we would like to extract a secure key instead of a single bit. Alice and Bob will create all the key bits the same way: by applying a random linear function to the output bits, i.e., $S := A \odot X$, where $A$ is an $s \times n$-matrix over $GF(2)$ with $p(0) = p(1) = 1/2$ for all entries and we write $\odot$ for the multiplication modulo 2. Let us now see why this key is secure. First, we reduce the security of the key $S$ to the question of the security of every single bit.
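The key-generating map $S := A \odot X$ is simple to write down. The sketch below draws $A$ with independent uniform entries and computes the product modulo 2. The function names are hypothetical, and the standard-library `secrets` module is used here only as one possible source of random bits.

```python
import secrets

def random_matrix_gf2(rows, cols):
    """Each entry is an independent uniform bit, i.e. p(0) = p(1) = 1/2."""
    return [[secrets.randbits(1) for _ in range(cols)] for _ in range(rows)]

def apply_gf2(A, x):
    """Matrix-vector product modulo 2: every output bit is the XOR of a subset of x."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]

# Usage: hash an n-bit raw key down to s key bits.
n, s = 16, 4
raw_key = [secrets.randbits(1) for _ in range(n)]
key = apply_gf2(random_matrix_gf2(s, n), raw_key)
```

Each row of $A$ selects a random subset of the raw bits whose XOR forms one key bit, which is how the single-bit result of the previous section enters.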

Lemma 15. Assume $S := [S_1, \ldots, S_s]$, where $S_i$ are bits. Then

$$d(S|Z(W), Q) \le \sum_i d(S_i|Z(W), Q, S_1, \ldots, S_{i-1}) . \qquad (8)$$

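Proof.

$$\begin{aligned}
d(S|Z(W), Q) &= \frac{1}{2} \sum_{s,q} \max_{w} \sum_{z} \left| P_{S,Z,Q|W=w}(s,z,q) - \frac{1}{2^{s}} \cdot P_{Z,Q|W=w}(z,q) \right| \\
&\le \frac{1}{2} \sum_{s,q} \max_{w} \sum_{z} \Big[ \left| P_{S,Z,Q|W=w}(s,z,q) - \tfrac{1}{2} \cdot P_{S_1 \ldots S_{s-1},Z,Q|W=w}(s_1, \ldots, s_{s-1}, z, q) \right| \\
&\qquad + \cdots + \tfrac{1}{2^{s-1}} \left| P_{S_1,Z,Q|W=w}(s_1, z, q) - \tfrac{1}{2} \cdot P_{Z,Q|W=w}(z,q) \right| \Big] ,
\end{aligned}$$

where the first equation is by the definition of the distance from uniform and the second inequality is by the triangle inequality. Summing the $i$th term over $s_{i+1}, \ldots, s_s$ and bounding the maximum of a sum by the sum of the maxima gives the terms $d(S_i|Z(W), Q, S_1, \ldots, S_{i-1})$ of (8). □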



We now need to bound the distance from uniform of the $i$th key bit given all previous bits.

Lemma 16. Assume $S := A \odot X$, where $A$ is an $i \times n$-matrix over $GF(2)$ and let $P_A$ be the uniform distribution over all these matrices. $Q := (U = u, V = v, A)$. Then

$$d(S_i|Z(W), Q, S_1, \ldots, S_{i-1}) \le \frac{1}{2} \cdot 2^{i-1} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n . \qquad (9)$$

Proof. Bounding the distance from uniform of $S_i$ given $S_1, \ldots, S_{i-1}$ corresponds to bounding the distance from uniform of $S_i$ given all linear combinations over $GF(2)$ of $S_1, \ldots, S_{i-1}$ (see Appendix G.2). For each linear combination $\bigoplus_{j \in I} S_j$ define the random bit $S_c = c \odot X$ where $c = \bigoplus_{j \in I} a_j \oplus a_i$ and $a_j$ denotes the $j$th line of the matrix $A$. Note that $S_c$ is a random linear function over $X$ (the proof of this is given in Appendix G.3). If $S_c$ is uniform and independent of $S_1, \ldots, S_{i-1}$, then $S_i$ is uniform given this specific linear combination. However, the distance from uniform and independent of $S_c$ is given by Lemma 6 (note that Lemma 6 bounds not only the distance from uniform of $S_c$ given $Z$, but also given all $X_j$ not included in $S_c$, as these could be included in the variable $Z$). We obtain

$$d(c \odot X|Z(W), Q) \le \frac{1}{2} \cdot \frac{1}{2^n} \sum_{K \subseteq \{1, \ldots, n\}} \prod_{i \in K} (4\varepsilon_i) \le \frac{1}{2} \cdot \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k} (4\varepsilon)^k = \frac{1}{2} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n ,$$

where the second inequality follows from the fact that this expression is maximized when all $\varepsilon_i$ are equal (see Appendix G.4 for a proof of this). If a random variable $S$ has distance from uniform at most $d$, then we can define an event $\mathcal{E}$ occurring with probability at least $1 - d$ such that given $\mathcal{E}$, $S$ is uniform. By the union bound over all $2^{i-1}$ possible linear combinations of $S_1, \ldots, S_{i-1}$, we obtain the probability that $S_i$ is uniform given $S_1, \ldots, S_{i-1}$ and, therefore, the bound on the distance from uniform

$$d(S_i|Z(W), Q, S_1, \ldots, S_{i-1}) \le \frac{1}{2} \cdot 2^{i-1} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n . \qquad (10)$$

□

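Now we can bound the distance from uniform of a key $S := S_1 \ldots S_s$ by Lemmas 15 and 16.

Lemma 17. Assume $S := A \odot X$, where $A$ is an $s \times n$-matrix over $GF(2)$ and let $P_A$ be the uniform distribution over all these matrices. $Q := (U = u, V = v, A)$. Then

$$d(S|Z(W), Q) \le \frac{1}{2} \cdot 2^{s} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n . \qquad (11)$$

Proof. By Lemmas 15 and 16,

$$d(S|Z(W), Q) \le \frac{1}{2} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n \cdot \sum_{i=1}^{s} 2^{i-1} \le \frac{1}{2} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n \cdot (2^s - 1) \le \frac{1}{2} \cdot 2^{s} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n ,$$

where the second inequality follows from the expression for geometric series. □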



6.2 Information Reconciliation 

In general, the outputs $x$ and $y$ of Alice and Bob are not equal but have a certain probability to differ. Alice and Bob, therefore, need to do information reconciliation. They can do this the same way they create the key, namely by using a random linear code. This follows directly from a result from [16] about two-universal sets of hash functions and from a result from [2] about information reconciliation. We restate the theorems below.

Theorem 1 ([16]). The set of functions $f_A(x) := A \odot x$, where $A$ is any $n \times m$-matrix over $GF(2)$, is two-universal.

Theorem 2 ([14]). Suppose an $n$-bit string $x$ and another $n$-bit string $y$ obtained by sending $x$ over a binary symmetric channel with error parameter $\delta$. Assume the function $f : \{0, 1\}^n \to \{0, 1\}^m$ is chosen at random amongst a set of two-universal functions. Choose $y'$ such that $d_H(y, y')$ is minimal
among all strings r with f[r) = f(x). Then P x ^y' < 1 — e~ 2 ™ h(,5+e) m -|- ( 1 °g"-)^ £ ( 1 s ) _ 



The above theorems show that in the limit of large $n$, $m = \lceil n \cdot h(\delta) \rceil$ (where $\delta$ is the probability that Bob's bit is different from Alice's and $h$ the binary entropy function) is both necessary and sufficient for Bob to correct the errors in his raw key, i.e., the protocol is $\varepsilon'$-correct for any $\varepsilon' > 0$.
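The decoding rule of Theorem 2 (pick the string closest to $y$ among all strings with the announced hash value) can be sketched by brute force for toy sizes. As an assumption for reproducibility, the example uses the parity-check matrix of the $[7,4]$ Hamming code in place of a randomly chosen $A$; the function names are hypothetical, and the exhaustive search is exponential in $n$, so this only illustrates the rule and is not an efficient decoder.

```python
from itertools import product

def apply_gf2(A, x):
    """Hash value A (.) x over GF(2), returned as a tuple."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in A)

def reconcile(A, hash_value, y):
    """Return the string r closest to y in Hamming distance with A (.) r == hash_value."""
    candidates = (r for r in product((0, 1), repeat=len(y))
                  if apply_gf2(A, r) == hash_value)
    return min(candidates, key=lambda r: sum(ri != yi for ri, yi in zip(r, y)))

# Toy instance: parity-check matrix of the [7,4] Hamming code (corrects one error).
H = [(1, 0, 1, 0, 1, 0, 1),
     (0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]
x = (1, 0, 1, 1, 0, 0, 1)          # Alice's raw key
y = (1, 0, 1, 0, 0, 0, 1)          # Bob's raw key: x with the fourth bit flipped
y_corrected = reconcile(H, apply_gf2(H, x), y)
```

Here Alice only announces the three hash bits, and Bob recovers $x$ because every other string with the same hash value differs from $x$ in at least three positions.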

If Alice and Bob communicate $m$ bits during the information reconciliation phase, then the security of the key after information reconciliation can be calculated by replacing in Lemma 17 the length of the key by the length of the key plus information reconciliation, i.e., $s \to s + m$, and we obtain the following lemma.

Lemma 18. Assume $[S, R] := A \odot X$, where $A$ is an $(s + m) \times n$-matrix over $GF(2)$ and let $P_A$ be the uniform distribution over all these matrices. $Q := (U = u, V = v, A)$. Then

$$d(S|Z(W), Q, R) \le \frac{1}{2} \cdot 2^{s+m} \cdot \left( \frac{1 + 4\varepsilon}{2} \right)^n . \qquad (12)$$
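Bound (12) directly gives the key length that can be extracted for a finite number of boxes. The sketch below works in the logarithmic domain to avoid underflow; the parameter values and the target distance $2^{-64}$ are illustrative assumptions, and the function names are hypothetical.

```python
from math import ceil, floor, log2

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def log2_distance_bound(s, m, n, eps):
    """log2 of the right-hand side of (12): 1/2 * 2^(s+m) * ((1+4*eps)/2)^n."""
    return -1 + s + m + n * log2((1 + 4 * eps) / 2)

def max_key_length(n, delta, eps, log2_target):
    """Largest s such that bound (12) is at most 2^log2_target, together with m.

    The returned s can be negative, meaning that no key can be extracted.
    """
    m = ceil(n * binary_entropy(delta))
    s = floor(log2_target + 1 - m - n * log2((1 + 4 * eps) / 2))
    return s, m

n, delta, eps = 10_000, 0.01, 0.19        # illustrative parameters
s, m = max_key_length(n, delta, eps, log2_target=-64)
```

The ratio $s/n$ obtained this way approaches the asymptotic key rate as $n$ grows.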



6.3 Key Rate 

The key rate is the length of the key divided by the number of boxes used, in the limit of a large number of boxes. Because we only need a small number of boxes for parameter estimation (see Appendix D), this will asymptotically correspond to $q := s/n$. From Lemma 18 we can calculate the key rate by setting $m := h(\delta) \cdot n$ (see also Protocol 1 in Section 6.5 for a detailed description of the protocol).

Lemma 19. The protocol reaches a key rate $q$ of

$$q = 1 - h(\delta) - \log_2(1 + 4\varepsilon) . \qquad (13)$$

Proof. From Lemma 18 and by the definition of the key rate, we can see that the protocol reaches a key rate $q$ if

$$2^{h(\delta)} \cdot 2^{q} \cdot \frac{1 + 4\varepsilon}{2} < 1 .$$

□

Corollary [2] states for which parameters key agreement is possible (see Figured]). 

Corollary 2. The protocol reaches a positive key rate if $\varepsilon < 2^{-h(\delta)-1} - 1/4$.

If the boxes have the same error for all inputs ($\delta = \varepsilon$), then $m := n \cdot h(\varepsilon)$ and the protocol does not reach a positive secret key rate for $\varepsilon$ equal to the minimum value reachable by quantum mechanics. To reach a positive key rate using quantum mechanics, Alice and Bob will, therefore, need to use different boxes, as described in the next section.
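The rate formula (13) and the threshold of Corollary 2 are easy to evaluate. In the sketch below, the value $(2 - \sqrt{2})/4 \approx 0.146$ for the smallest CHSH error reachable quantum mechanically is supplied here from Tsirelson's bound and is not taken from the text above; the function names are hypothetical.

```python
from math import log2, sqrt

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def key_rate(delta, eps):
    """Asymptotic key rate q = 1 - h(delta) - log2(1 + 4*eps) of Lemma 19."""
    return 1 - binary_entropy(delta) - log2(1 + 4 * eps)

def max_tolerable_eps(delta):
    """Threshold of Corollary 2: the rate is positive iff eps is below this value."""
    return 2 ** (-binary_entropy(delta) - 1) - 0.25

# Assumed value: minimal CHSH error allowed by quantum mechanics (Tsirelson's bound).
eps_quantum = (2 - sqrt(2)) / 4
rate_same_error = key_rate(eps_quantum, eps_quantum)   # case delta = eps
```

With $\delta = \varepsilon$ at this value the rate is negative, which is why the next section lets $\delta$ and $\varepsilon$ differ.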



6.4 The Quantum Regime 

To get a positive key rate in the quantum regime, Alice and Bob use a box which gives highly correlated output bits given input $(0, 0)$ (see Figure 13) and generate their raw key only from these outputs (see footnote 15). The parameter limiting Eve's knowledge is then still $\varepsilon = \frac{1}{4} \cdot \sum_{x \oplus y \ne u \cdot v} P_{XY|UV}(x, y, u, v)$; the parameter defining the amount of information reconciliation necessary is, however, the error in the correlation given input $(0, 0)$ ($\delta$ in Figure 13). Note that in a noiseless setting the distribution described in black font can be achieved by measuring a singlet state (see Protocol 1 below). In that case, Alice and Bob will have perfectly correlated bits (and therefore would not need to do any information reconciliation), and the parameter limiting Eve's knowledge is $\varepsilon = 0.1875$. The parameters $\delta$ and $\varepsilon$ (in light gray font in Figure 13) are introduced to account for the noise in the state and/or measurement.
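The value $\varepsilon = 0.1875$ can be reproduced with a simple model. The sketch below is built on assumptions that are not taken from the figures: it uses hypothetical polarization angles (0° and -30° for Alice, 0° and 30° for Bob) and assumes that, after a suitable relabeling of one party's outcome, the two outputs coincide with probability $\cos^2$ of the angle difference. The actual bases are the ones shown in Figure 14.

```python
from math import cos, log2, radians

# Hypothetical measurement angles in degrees, indexed by the input bit.
ALICE = {0: 0.0, 1: -30.0}
BOB = {0: 0.0, 1: 30.0}

def p_equal(u, v):
    """Assumed model: P[X = Y | u, v] = cos^2 of the angle difference."""
    return cos(radians(ALICE[u] - BOB[v])) ** 2

def chsh_error():
    """Average over the four inputs of P[X xor Y != u * v]."""
    total = 0.0
    for u in (0, 1):
        for v in (0, 1):
            want_equal = (u & v) == 0           # CHSH asks for X = Y unless u = v = 1
            total += (1 - p_equal(u, v)) if want_equal else p_equal(u, v)
    return total / 4

eps = chsh_error()                 # parameter limiting Eve's knowledge
delta = 1 - p_equal(0, 0)          # error of the raw key bits on input (0, 0)
rate = 1 - log2(1 + 4 * eps)       # Lemma 19 with h(delta) = 0
```

Under these assumptions the three inputs other than $(0, 0)$ each fail the CHSH condition with probability 1/4, which gives $\varepsilon = 3/16$ and a positive rate in the noiseless case.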

15 Another way to reach a positive key rate in the quantum regime is to use a type of non-locality characterized by a 
different Bell inequality allowing for a higher violation in the quantum regime. See [29] for details. 
Fig. 13. The quantum box used for key agreement.

6.5 The Protocol 

In the following we give a detailed description of our key agreement protocol. 
Protocol 1. 

1. Alice creates $n + k$ maximally entangled states $|\Psi^-\rangle = \frac{1}{\sqrt{2}} (|01\rangle - |10\rangle)$, for some $k = \Theta(n)$, and sends one qubit of every state to Bob.

2. Alice and Bob randomly measure the $i$th system in either the basis $U_0$ or $U_1$ (for Alice) or $V_0$ or $V_1$ (for Bob); the four bases are shown in Figure 14. All the $2(n + k)$ measurement events are pairwise space-like separated.

3. They randomly choose $n$ of the measurement results when both measured $U_0, V_0$ to form the raw key.

4. For the remaining $k$ measurements they announce the results over the public authenticated channel and estimate the parameters $\varepsilon$ and $\delta$ (see Appendix D). They also check whether they have obtained roughly the same number of 1's and 0's (for the information reconciliation scheme). If the parameters are such that key agreement is possible (see Corollary 2 and the accompanying figure) they continue; otherwise they abort.

5. Information reconciliation and privacy amplification: Alice randomly chooses an $(m + s) \times n$-matrix $A$ such that $p(0) = p(1) = 1/2$ for all entries and $m := \lceil n \cdot h(\delta) \rceil$. She calculates $A \odot x$ (where $x$ is Alice's raw key) and sends the first $m$ bits to Bob over the public authenticated channel. The remaining bits form the key.








Fig. 14. Alice's and Bob's measurement bases in terms of polarization. 



Lemmas 18 and 19 imply that this protocol allows for secure key agreement, as stated in the following theorem.

Theorem 3. Protocol 1 achieves a positive secret-key-generation rate as soon as the parameter estimation shows an approximation of PR boxes with an accuracy exceeding 80% and a correlation of the outputs on input $(0, 0)$ higher than 98%. There exists an event $A$ with probability $\mathrm{Prob}[A] = 2^{-\Omega(n)}$ such that given $A$ does not occur and the protocol is not aborted, then Alice and Bob share a common key that is perfectly secret, where this secrecy is based solely on the non-signaling condition.

The above protocol also allows for traditional entanglement-based quantum key agreement [20]. Therefore, we have the following.

Corollary 3. Protocol 1 allows for efficient information-theoretic key agreement if quantum OR relativity theory is correct.

7 Concluding Remarks and Open Questions 

We propose a new efficient protocol for generating a secret key between two parties connected by 
a quantum channel whose security is guaranteed solely by the fact that the measured correlations 
violate a Bell inequality. Quantum mechanics guarantees the protocol to work, i.e., the required 
correlations to occur. But the security proof is completely independent of quantum mechanics, once 
the non-local correlations are established and have been verified by the legitimate partners. 

The practical relevance of this fact is that the resulting security is device-independent: We could 
even use devices manufactured by the adversary to do key agreement. The theoretical relevance is 
that the resulting protocol is secure if either relativity or quantum theory is correct. This is in the 
spirit of modern cryptography's quest to minimize assumptions under which security can be proven. 

Our scheme requires space-like separation not only between events happening on Alice's and 
Bob's side, but also between events in the same laboratory. It is a natural open question whether 
the space-like-separation conditions can be relaxed. For instance, is it sufficient if they hold on one 
of the two sides? Or in one direction among the n events on each side? Obviously, the latter would 
be very easy to guarantee in practice. 
ger for helpful discussions, and Hoi-Kwong Lo for bringing reference [36] to our attention. EH and 

Appendix 



A Best Box Partition of a Single Box 

In this section, we show that the bound derived in Lemma [5] is tight. 

Lemma 20. Assume a box $P_{XYZ|UVW}$ such that the marginal $P_{XY|UV}$ is a non-local box with 
$\frac{1}{4}\sum_{x,y,u,v:\, x \oplus y = u \cdot v} P_{XY|UV}(x,y,u,v) = 1 - \varepsilon$ and $\varepsilon < 0.25$. Then there exists a box partition w such that, 
knowing the inputs, Z gives binary erasure information about X and $P(Z \in \{0,1\}) = 4\varepsilon$. This box 
partition reaches 

$$d(X|Z(W),Q) = \frac{1}{2} \cdot 4\varepsilon$$

for $Q := (U = u, V = v)$. 

Proof. The proof is the following box partition: 
To see that (14) indeed defines a box partition, notice that the parameters $a_2, a_3, b_2, b_3, c_2, c_3, d_1, d_4$ 
(the ones for which the CHSH condition is not fulfilled, i.e., $x \oplus y \neq u \cdot v$) fully characterize 
any box. By the normalization ($\sum_i a_i = 1$, and similarly for b, c and d) and the non-signaling condition 
($a_1 + a_2 = b_1 + b_2$, and similarly for the other rows and columns) we can express $a_1$ as 

$$a_1 = \frac{1}{2}\left(1 - a_2 - a_3 + b_2 - b_3 - c_2 + c_3 + d_1 - d_4\right) .$$

This shows that the right-hand side and left-hand side of the equation are indeed equal. Because we 
assumed $4\varepsilon < 1$, the above decomposition represents a convex combination of several boxes and is, 
therefore, itself a box. 

With probability $a_2 + a_3 + b_2 + b_3 + c_2 + c_3 + d_1 + d_4 = 4\varepsilon$, Z is such that $P^z_{XY|UV}$ is local 
deterministic (i.e., X (Y) is a deterministic function of U (V)), in which case, knowing U = u and 
V = v, Z gives perfect information about X ($Z \in \{0, 1\}$). With probability $1 - 4\varepsilon$, Z is such that 
$P^z_{XY|UV}$ is a perfect non-local box, in which case Z cannot give any information about X by the 
non-signaling condition ($Z = \perp$). □ 



B Linear Programming 

In this section, we very briefly state the main facts about linear programming that we use for our 
argument. See, for example, [13] for a more detailed introduction. 

A linear program is an optimization problem with a linear objective function and linear inequality 
(and equality) constraints, i.e., it can be expressed as 

$$\begin{array}{ll} \max: & b^T \cdot x \\ \text{s.t.} & A \cdot x \leq c , \end{array}$$

where x is the variable we want to optimize. An x which fulfills the constraints is called feasible. The 
set of feasible x is convex, more precisely, a convex polyhedron, that is, a convex set with a finite 
number of extremal points (vertices). A feasible x which maximizes the objective function $b^T \cdot x$ 
is called an optimal solution and is denoted by $x^*$. The value of $b^T \cdot x^*$, i.e., the maximal value of the 
objective function, is called the optimal value and is denoted by $q^*$. There is always a vertex at which the 
optimal value is attained. 

An important notion of linear programming is duality: the above linear program is called the primal 
problem. From this linear program, another linear program can be derived, defined by 



$$\begin{array}{ll} \min: & c^T \cdot \lambda \\ \text{s.t.} & A^T \cdot \lambda = b \\ & \lambda \geq 0 ; \end{array}$$

this problem is called the dual, its optimal solution is denoted by $\lambda^*$ and its optimal value by $d^*$. 
The weak duality theorem says that the value of the primal objective function for every feasible x 
is smaller than or equal to the value of the dual objective function for every feasible $\lambda$. The strong duality 
theorem says that the two optimal values are equal, i.e., $q^* = d^*$. It is therefore possible to solve a 
linear program either by solving the linear program itself, or by solving its dual. 
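
The primal/dual relation can be checked by hand on a small instance. The following Python sketch uses a hypothetical two-variable program (the matrices `A`, `b`, `c` below are illustrative assumptions, not the linear program of Section 5). It finds the primal optimum by enumerating the vertices of the feasible polygon and compares it with the values of two dual-feasible points.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy instance (not from the paper): maximize b^T x subject to A x <= c.
A = [[1, 0], [0, 1], [1, 1], [-1, 0], [0, -1]]
c = [2, 2, 3, 0, 0]
b = [1, 2]

def dot(u, v):
    """Exact inner product of two sequences."""
    return sum(Fraction(p) * Fraction(q) for p, q in zip(u, v))

def feasible_vertices(A, c):
    """Intersect every pair of constraint lines and keep the feasible points."""
    vertices = []
    for i, j in combinations(range(len(A)), 2):
        det = A[i][0] * A[j][1] - A[i][1] * A[j][0]
        if det == 0:
            continue  # parallel constraints, no intersection point
        x = Fraction(c[i] * A[j][1] - A[i][1] * c[j], det)
        y = Fraction(A[i][0] * c[j] - A[j][0] * c[i], det)
        if all(dot(row, (x, y)) <= bound for row, bound in zip(A, c)):
            vertices.append((x, y))
    return vertices

def is_dual_feasible(A, b, lam):
    """Check lam >= 0 and A^T lam = b."""
    return all(l >= 0 for l in lam) and all(
        dot(col, lam) == bj for col, bj in zip(zip(*A), b)
    )

# Primal optimum: the best vertex of the feasible polygon.
x_star = max(feasible_vertices(A, c), key=lambda x: dot(b, x))
q_star = dot(b, x_star)

# Two dual-feasible points chosen by hand; the first attains the primal optimum.
lam_star = [0, 1, 1, 0, 0]
lam_other = [1, 2, 0, 0, 0]
d_star = dot(c, lam_star)
```

Every dual-feasible point gives an upper bound on `q_star` (weak duality), and `lam_star` attains it (strong duality). The argument in the main text uses this in the same way: exhibiting a dual-feasible point bounds the primal optimum without solving the primal.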




C Explicit Values of the Linear Program for a Single Box 



In this section, we give the explicit expressions for the parameters of the linear program described 
in Section [5] for the case of a single box. 

For a single box, A, b, c have the values 
and the dual optimal $\lambda^*$ is 

$\lambda^{*T} = (\, 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \,)$ 

To obtain the value of the objective function ($c^T \cdot \lambda^*$), the first part of $\lambda^*$ is multiplied by 
0, i.e., it does not contribute to the value. The second part is multiplied by $P_{XY|UV}$. We can easily 
see by comparison that for every x, y, u, v such that $x \oplus y \neq u \cdot v$ there is exactly one 1 in the second 
part of $\lambda^*$ and everywhere else $\lambda^*$ is 0, i.e., 

$$c^T \cdot \lambda^* = \sum_{x,y,u,v:\, x \oplus y \neq u \cdot v} P_{XY|UV}(x, y, u, v) .$$

The above values are for the input u, v = 0, 0. The optimal $\lambda^*$ reaching the same value for 
different u, v are given below. 

For u, v = 0, 1: 

$b^T = (\, 0 \ 1 \ 1 \ {-1} \ {-1} \ 0 \,)$ 
$\lambda^{*T} = (\, 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \,)$ 

For u, v = 1, 0: 

$b^T = (\, 0 \ 1 \ 1 \ {-1} \ {-1} \ 0 \,)$ 
$\lambda^{*T} = (\, 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \,)$ 

For u, v = 1, 1: 

$b^T = (\, 0 \ 0 \ 1 \ 1 \ 0 \ 0 \ {-1} \ {-1} \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \,)$ 
$\lambda^{*T} = (\, 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 0.5 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \ 1 \,)$ 



D Parameter Estimation 



A crucial step of any quantum key distribution protocol is parameter estimation. Alice and Bob 
need to test a small sample of the boxes they have received, to see whether they have received boxes 
with the correct parameters. This can be done by classical sampling theory, as given in [27] (see 
also [21]). 



Lemma 21. [27], [24] Let Z be an n-tuple and Z' a k-tuple of random variables over a set $\mathcal{Z}$, with 
symmetric joint probability $P_{ZZ'}$. Let $Q_{z'}$ be the relative frequency distribution of a fixed sequence 
z' and $Q_{(z,z')}$ be the relative frequency distribution of a sequence (z, z'), drawn according to $P_{ZZ'}$. 
Then for every $\epsilon \geq 0$ we have 

$$P_{ZZ'}\left[\, \| Q_{(z,z')} - Q_{z'} \| \geq \epsilon \,\right] \leq |\mathcal{Z}| \cdot e^{-k\epsilon^2/(8|\mathcal{Z}|)} .$$

In our case, Alice and Bob share n + k boxes. After they have used the 
boxes and announced the inputs, they randomly choose k of the boxes, for which they also uncover 
the outputs. Call $\varepsilon_{meas}$ the fraction of those k boxes for which $x \oplus y \neq u \cdot v$. We call $\varepsilon$ the average error 
of the remaining n boxes. Then, with $\epsilon$ denoting the tolerance of Lemma 21 (not the box error $\varepsilon$), 

$$P_S\left[\, \left\| \frac{n}{n+k} (\varepsilon - \varepsilon_{meas}) \right\| \geq \epsilon \,\right] \leq 2 \cdot e^{-k\epsilon^2/16}$$

$$P_S\left[\, \varepsilon \geq \varepsilon_{meas} + \left(1 + \frac{k}{n}\right) \cdot \epsilon \,\right] \leq 2 \cdot e^{-k\epsilon^2/16} .$$

Obviously, Alice and Bob can also test other parameters, such as d (the correlation of their output 
bits given input (0, 0)), in a similar way. 

This means, if the boxes Eve has distributed are not good enough for key agreement, Alice and 
Bob will most certainly detect this. If they are good enough, then Alice's and Bob's test will most 
certainly be passed and key agreement is possible as discussed above. 
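
To get a feeling for the sample sizes involved, the bounds above can be evaluated numerically. The Python sketch below simply evaluates the expression $2 \cdot e^{-k\epsilon^2/16}$ and the resulting upper bound on the average error as they are stated in this section. The function names and the sample values are illustrative assumptions, not part of the protocol description.

```python
import math

def failure_bound(k, tol):
    """Value of 2*exp(-k*tol^2/16): bound on the probability that the
    estimate deviates by more than the tolerance tol."""
    return 2.0 * math.exp(-k * tol ** 2 / 16.0)

def error_upper_bound(eps_meas, n, k, tol):
    """Bound eps_meas + (1 + k/n)*tol on the average error of the n untested boxes."""
    return eps_meas + (1.0 + k / n) * tol

def samples_needed(tol, delta):
    """Smallest number k of test boxes with 2*exp(-k*tol^2/16) <= delta."""
    return math.ceil(16.0 * math.log(2.0 / delta) / tol ** 2)

# Illustrative (hypothetical) numbers: tolerance 0.05, failure probability 1e-6.
k = samples_needed(0.05, 1e-6)
bound = error_upper_bound(eps_meas=0.02, n=10 * k, k=k, tol=0.05)
```

The required number of test boxes grows like $1/\epsilon^2$ in the tolerance, but only logarithmically in the inverse failure probability.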

E Depolarization 

Assume Alice and Bob share an arbitrary distribution $P_{XY|UV}$ where X, Y, U, V are n-bit strings. 
Then they can perform a sequence of local operations and public communication in order to obtain 
a distribution which corresponds to a convex combination of n independent approximations of a 
PR box with errors $\varepsilon_i$. Further, each approximation of the PR box $P_{X_iY_i|U_iV_i}$ has unbiased outcomes 
and the same error $\varepsilon_i$ for all inputs. The local operations achieving this are given in [30], [31]. We 
restate them here briefly: For each i, Alice and Bob choose the mapping independently in two steps. 
First, with probability 1/2 each, they do either of the following: 

1. nothing 

2. both flip their outcome bits, i.e., $x_i \to x_i \oplus 1$ and $y_i \to y_i \oplus 1$. 

Then, with probability 1/4 each, they do either of the following: 

1. nothing 

2. $x_i \to x_i \oplus u_i$ and $v_i \to v_i \oplus 1$ 

3. $u_i \to u_i \oplus 1$ and $y_i \to y_i \oplus v_i$ 

4. $u_i \to u_i \oplus 1$, $x_i \to x_i \oplus u_i \oplus 1$, $v_i \to v_i \oplus 1$ and $y_i \to y_i \oplus v_i$. 

The choice of local operation needs 3 random bits per box, which have to be communicated from 
Alice to Bob. Because each of these operations conserves the probability of error $\varepsilon_i$, a box with the 
same error parameter (but now an unbiased one with the same error for all inputs) is obtained. 
Furthermore, when this transformation is applied to each input/output bit of a distribution $P_{XY|UV}$ 
taking n bits input and giving n bits output, a convex combination of products of independent and 
unbiased approximations of PR boxes (with possibly different errors $\varepsilon_i$) is obtained. 
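
The effect of the two steps on a single box can be verified by direct enumeration. The Python sketch below applies the eight combinations of operations with equal weight to an arbitrary conditional distribution, given as a dictionary `P[(x, y, u, v)]`. The encoding of the four second-step operations as tuples is our own reading of the list above, in which the output corrections use the inputs chosen by the user before any flip; it is an assumption, not the notation of [30], [31].

```python
from itertools import product

# Second-step operations as functions of the user inputs (u, v). Each returns
# (input given to Alice's box, input given to Bob's box,
#  correction XORed onto Alice's output, correction XORed onto Bob's output).
OPS = [
    lambda u, v: (u, v, 0, 0),              # 1. nothing
    lambda u, v: (u, v ^ 1, u, 0),          # 2. x -> x + u, v -> v + 1
    lambda u, v: (u ^ 1, v, 0, v),          # 3. u -> u + 1, y -> y + v
    lambda u, v: (u ^ 1, v ^ 1, u ^ 1, v),  # 4. both inputs flipped
]

def depolarize(P):
    """Average the box P[(x, y, u, v)] = P(x, y | u, v) over all 2 * 4 operations."""
    Q = {}
    for x, y, u, v in product((0, 1), repeat=4):
        total = 0.0
        for op in OPS:
            box_u, box_v, cx, cy = op(u, v)
            for flip in (0, 1):  # first step: joint flip of both outputs
                # Box outputs that are mapped to the final outputs (x, y).
                total += P[(x ^ cx ^ flip, y ^ cy ^ flip, box_u, box_v)] / 8
        Q[(x, y, u, v)] = total
    return Q

def error(P, u, v):
    """Probability that the CHSH condition x XOR y = u AND v is violated on input (u, v)."""
    return sum(P[(x, y, u, v)] for x, y in product((0, 1), repeat=2)
               if (x ^ y) != (u & v))

# Example: a local deterministic box that always outputs x = y = 0.
P_det = {(x, y, u, v): (1.0 if (x, y) == (0, 0) else 0.0)
         for x, y, u, v in product((0, 1), repeat=4)}
Q_det = depolarize(P_det)
```

The example box errs only on input (1, 1), so its average error is 1/4. After the transformation, the error is 1/4 on every input and each output bit is uniform.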

F Eve Can Always Know a Certain Fraction of Bits 

Can Eve really do collective attacks which are better than individual ones? In this section we show 
that this is indeed possible and give a collective attack for the case when Alice and Bob share n 
boxes with error $\varepsilon$ and such that the error is the same for all inputs.¹⁶ We show that for every value 
of $\varepsilon$ there exists an attack of Eve such that she knows with certainty a fraction of all the output 
bits of Alice, an option unavailable if only individual attacks are allowed. What fraction Eve can 
know depends on the value of $\varepsilon$. 

16 In the following, we will only consider unbiased boxes with the same error for all inputs. The box is, therefore, fully 
characterized by its error $\varepsilon$. 



F.l Example of a Better Collective Attack for Two Boxes 



We first describe an example of an attack on two boxes. We will give an explicit strategy of Eve 
(a box partition) which shows that she can know either one of the two bits with higher probability 
than what can be done by an individual attack (although Eve cannot choose which one of the two 
bits she will get to know). This shows that collective attacks are strictly stronger than individual 
attacks. In fact, assume Alice will communicate to Bob the XOR of her two output bits in the 
information reconciliation phase. In that case only the probability that Eve knows at least one of 
the two bits is important, because together with the information of the XOR this immediately gives 
her full information about both bits. 

Before we can give the box partition, we need to prove the following lemma. 

Lemma 22. Every box with $\varepsilon \in [1/4, 3/4]$ is local and can be expressed as a convex combination 
of local deterministic boxes. We use the short-hand notation $L^{\varepsilon}$ for these local $\varepsilon$-boxes. 

Proof. According to Lemma 20, a box with $\varepsilon = 0.25$ is local and can be expressed as a convex 
combination of local deterministic boxes. A local box with $\varepsilon = 0.75$ can be obtained from the one with 
$\varepsilon = 0.25$ by flipping one of the output bits. Every box with $\varepsilon \in (1/4, 3/4)$ can then be expressed as 
a convex combination of the above boxes and is therefore local. □ 



We have already seen that if a box can be expressed as a convex combination of local deterministic 
boxes, then there exists a box partition (where the elements are exactly the local deterministic 
boxes) such that, knowing the inputs, the outputs are completely determined. While one is usually 
interested in the best strategy for the CHSH game with a 
local box, we will now see that in our example the bad (local) strategies are important. 

Eve's strategy is given by Lemma 23. Note that the local boxes with the largest error play an 
important role here. Eve's outcome Z is composed of two symbols, such that the first describes the 
first box given outcome Z = z and the second symbol the second box. More precisely, we use $Z_i = l$ 
for an outcome given which box i is local and $Z_i = \perp$ for an outcome given which box i is a PR box. 

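Lemma 23. Assume a (2 · 2 + 1)-partite non-signaling distribution $P_{XYZ|UVW}$ such that $P_{XY|UV}$ 
corresponds to two independent boxes with $P(X_i \oplus Y_i = U_i \cdot V_i) = 1 - \varepsilon$ for all inputs and i = 1, 2. 
Then the following is a box partition: 

$$\begin{array}{c|c|c}
z & p^z & P^z_{XY|UV} \\ \hline
l\,l & (\tfrac{4}{3}\varepsilon)^2 & L^{3/4} \cdot L^{3/4} \\
l\,\perp & (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon) & L^{1/4} \cdot NL \\
\perp\,l & (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon) & NL \cdot L^{1/4} \\
\perp\,\perp & 1 - \sum_{z \neq \perp\perp} p^z & NL \cdot NL
\end{array}$$

and $P^z_{XY|UV} := P^z_{X_1Y_1|U_1V_1} \cdot P^z_{X_2Y_2|U_2V_2}$, and where $L^{\varepsilon}$ stands for a local box with error $\varepsilon$ and NL for a PR 
box. 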

Proof. To see that this defines a box partition, let us first see that all boxes given outcome z are 
non-signaling between all four input/output ends. This is obviously the case, because each of the 
two boxes given outcome z is non-signaling and the double box given outcome z is given by the 
product of the two individual boxes. 

Now let us see that the marginal is correct. For this, we need to verify that the distribution of the 
output bits on each side is correct, and also that the probability that any subset of boxes fulfills 
the CHSH condition is correct. The first condition is fulfilled because all output bits are 
uniform, independent and random even given outcome z. Now let us see that the probability to 
fulfill/violate the CHSH condition is also correct. The probability that both boxes violate the CHSH 
condition is given by the probability to obtain $z_1 z_2 = ll$ (both boxes are local) times the probability 
that they then violate the CHSH condition. (If either of the boxes given outcome z is a non-local 
box, it never violates the CHSH condition; therefore no other outcomes z have to be considered.) 

$$P(X_i \oplus Y_i \neq U_i \cdot V_i \text{ for } i = 1, 2) = p^{ll} \cdot \varepsilon^{z=ll}_{box\,1} \cdot \varepsilon^{z=ll}_{box\,2} = (\tfrac{4}{3}\varepsilon)^2 \cdot (3/4)^2 = \varepsilon^2 ,$$

where $\varepsilon^{z}_{box\,1}$ denotes the error of the first box given outcome Z = z. Similarly, we can also show that 
the probability of the first box violating the CHSH condition is correct: 

$$P(X_1 \oplus Y_1 \neq U_1 \cdot V_1) = p^{ll} \cdot \varepsilon^{z=ll}_{box\,1} + p^{l\perp} \cdot \varepsilon^{z=l\perp}_{box\,1} = (\tfrac{4}{3}\varepsilon)^2 \cdot (3/4) + (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon) \cdot (1/4) = \varepsilon ,$$

and the same for all other subsets of boxes. This shows that the marginal $P_{XY|UV}$ is unchanged by 
this box partition. □ 



From this box partition, we directly obtain as a corollary: 

Corollary 4. Assume a (2 · 2 + 1)-partite non-signaling distribution $P_{XYZ|UVW}$ such that $P_{XY|UV}$ 
corresponds to two independent boxes with $P(X_i \oplus Y_i = U_i \cdot V_i) = 1 - \varepsilon$ for i = 1, 2. Then there 
exists a box partition w such that the probability that Z gives binary erasure information (knowing 
U = u, V = v) about at least one of the two output bits $X_1, X_2$ is $(\tfrac{4}{3}\varepsilon)^2 + 2 \cdot (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon)$. 

This is larger than $(4\varepsilon)^2 + 2 \cdot (4\varepsilon)(1 - 4\varepsilon)$, the value obtained by the best individual strategy. 
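
The two-box attack can be checked with exact arithmetic. The Python sketch below takes the weights and local errors that appear in the proof of Lemma 23, namely $p^{ll} = (\tfrac{4}{3}\varepsilon)^2$ with two $L^{3/4}$ boxes and $p^{l\perp} = p^{\perp l} = (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon)$ with one $L^{1/4}$ box, and assumes that the remaining weight is assigned to two PR boxes. In the dictionary keys, the letter `n` stands for the symbol $\perp$; the sample value of the error is illustrative only.

```python
from fractions import Fraction

def two_box_partition(eps):
    """Weights p^z and per-box errors of the two-box partition ('l' local, 'n' PR box)."""
    eps = Fraction(eps)
    a = Fraction(4, 3) * eps
    weights = {"ll": a ** 2, "ln": 4 * eps * (1 - a), "nl": 4 * eps * (1 - a)}
    weights["nn"] = 1 - sum(weights.values())  # assumed: remaining weight on two PR boxes
    errors = {
        "ll": (Fraction(3, 4), Fraction(3, 4)),
        "ln": (Fraction(1, 4), Fraction(0)),
        "nl": (Fraction(0), Fraction(1, 4)),
        "nn": (Fraction(0), Fraction(0)),
    }
    return weights, errors

def collective_success(eps):
    """Probability that Eve learns at least one of the two bits with the partition above."""
    weights, _ = two_box_partition(eps)
    return weights["ll"] + weights["ln"] + weights["nl"]

def individual_success(eps):
    """Same probability when each box is attacked separately (each bit known w.p. 4*eps)."""
    eps = Fraction(eps)
    return (4 * eps) ** 2 + 2 * (4 * eps) * (1 - 4 * eps)

eps = Fraction(1, 10)  # illustrative error value
weights, errors = two_box_partition(eps)
# Marginal checks: error of box 1 alone, and of both boxes simultaneously.
err_box1 = sum(weights[z] * errors[z][0] for z in weights)
err_both = sum(weights[z] * errors[z][0] * errors[z][1] for z in weights)
```

For this value of the error, `err_box1` and `err_both` reproduce $\varepsilon$ and $\varepsilon^2$, all weights are non-negative, and `collective_success(eps)` exceeds `individual_success(eps)`.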



F.2 Better Collective Attack for Any Number of Boxes and e 



We now give a generalization of the above strategy to attack two boxes to any number of boxes. 
The attack obtains knowledge about a fraction of the output with certainty and independently of 
the total number of boxes. Which fraction can be known depends on the error of the boxes. 

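Lemma 24. Assume a (2n + 1)-partite non-signaling distribution $P_{XYZ|UVW}$ such that $P_{XY|UV}$ 
corresponds to n boxes with $P(X_i \oplus Y_i = U_i \cdot V_i) = 1 - \varepsilon$ for i = 1, . . . , n. Then the following is a 
box partition: 

$$\begin{array}{c|c|c}
z & p^z & P^z_{XY|UV} \\ \hline
\{z \mid \#_l z = i \in [2, n]\} & (\tfrac{4}{3}\varepsilon)^i (1 - \tfrac{4}{3}\varepsilon)^{n-i} & (L^{3/4})^i \cdot (NL)^{n-i} \\
\{z \mid \#_l z = 1\} & (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon)^{n-1} & L^{1/4} \cdot (NL)^{n-1} \\
\{z \mid \#_l z = 0\} & 1 - \sum_{z:\, \#_l z \geq 1} p^z & (NL)^n
\end{array} \qquad (16)$$

where Z is composed of n symbols ($\{l, \perp\}^n$) and we write $\#_l z = i$ for a z which contains i symbols l. 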

The proof is analogous to the proof of Lemma 23. From the above box partition, we obtain the 
following lemma. 

Lemma 25. Assume a (2n + 1)-partite non-signaling distribution $P_{XYZ|UVW}$ such that $P_{XY|UV}$ 
corresponds to n independent boxes with $P(X_i \oplus Y_i = U_i \cdot V_i) = 1 - \varepsilon$. Then, whenever $\varepsilon \geq \frac{3}{8n+4}$, 
there exists a box partition w such that for all outcomes z, $P^z_{XY|UV}$ is such that at least one of the n 
boxes is fully local. 



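Proof. We use the box partition given in Lemma 24. The probability to obtain an outcome Z such 
that at least 1 of the n boxes given Z is fully local can be expressed as 

$$\sum_{z:\, \#_l z = i \geq 1} p^z = \sum_{i=2}^{n} \binom{n}{i} (\tfrac{4}{3}\varepsilon)^i (1 - \tfrac{4}{3}\varepsilon)^{n-i} + n \cdot (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon)^{n-1} .$$

Because of the binomial formula, this probability is equal to 1 whenever 

$$n \cdot (4\varepsilon)(1 - \tfrac{4}{3}\varepsilon)^{n-1} = n \cdot (\tfrac{4}{3}\varepsilon)(1 - \tfrac{4}{3}\varepsilon)^{n-1} + (1 - \tfrac{4}{3}\varepsilon)^n ,$$

i.e., whenever $\varepsilon = \frac{3}{8n+4}$. □ 

Therefore, whenever $\varepsilon \geq \frac{3}{8n+4}$, Eve can know 1 of the n bits with certainty. Or said differently, Eve 
can know roughly a fraction of $f = 1/n = \frac{8\varepsilon}{3 - 4\varepsilon} \geq 8\varepsilon/3$ of the bits with certainty. 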
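
The binomial argument can be reproduced with exact rational arithmetic. The Python sketch below sums the weights of the partition (16) over all outcomes with at least one local box, checks that the error of a single box is preserved, and evaluates the threshold $\varepsilon = 3/(8n+4)$ obtained by solving the last displayed equation for $\varepsilon$. The function names are our own and the values of n are illustrative.

```python
from fractions import Fraction
from math import comb

def weight_some_box_local(n, eps):
    """Total weight of the outcomes z containing at least one symbol l."""
    eps = Fraction(eps)
    a = Fraction(4, 3) * eps
    total = n * 4 * eps * (1 - a) ** (n - 1)  # exactly one local box
    total += sum(comb(n, i) * a ** i * (1 - a) ** (n - i) for i in range(2, n + 1))
    return total

def marginal_error_box1(n, eps):
    """Probability that box 1 violates the CHSH condition under the partition."""
    eps = Fraction(eps)
    a = Fraction(4, 3) * eps
    total = 4 * eps * (1 - a) ** (n - 1) * Fraction(1, 4)  # box 1 is the only local box
    total += sum(comb(n - 1, i - 1) * a ** i * (1 - a) ** (n - i) * Fraction(3, 4)
                 for i in range(2, n + 1))  # box 1 is one of i >= 2 local boxes
    return total

def threshold(n):
    """Error at which the weight of the all-PR outcome vanishes."""
    return Fraction(3, 8 * n + 4)

def known_fraction(eps):
    """Fraction f = 8*eps/(3 - 4*eps) of bits that Eve knows with certainty."""
    eps = Fraction(eps)
    return 8 * eps / (3 - 4 * eps)

n = 5  # illustrative number of boxes
total_at_threshold = weight_some_box_local(n, threshold(n))
```

At the threshold, the outcomes with at least one local box carry the full weight, so the outcome in which all n boxes are PR boxes never occurs. Below the threshold, some weight remains on that outcome.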



G Proofs 

G.l All Non-Signaling Conditions 

In this section, we show that Condition [TJ implies the non-signaling condition between all possible 
subsets of interfaces of the box. 

Lemma 26. Assume a system $P_{XYZ|UVW}$ such that 

$$\sum_x P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_x P_{XYZ|UVW}(x, y, z, u', v, w) \quad \forall y, z, v, w$$

$$\sum_y P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_y P_{XYZ|UVW}(x, y, z, u, v', w) \quad \forall x, z, u, w$$

$$\sum_z P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_z P_{XYZ|UVW}(x, y, z, u, v, w') \quad \forall x, y, u, v .$$

Then it also holds that 

$$\sum_{x,y} P_{XYZ|UVW}(x, y, z, u, v, w) = \sum_{x,y} P_{XYZ|UVW}(x, y, z, u', v', w) \quad \forall z, w .$$

Proof. 

$$\begin{aligned}
\sum_{x,y} P_{XYZ|UVW}(x, y, z, u, v, w) &= \sum_x \sum_y P_{XYZ|UVW}(x, y, z, u, v, w) \\
&= \sum_x \sum_y P_{XYZ|UVW}(x, y, z, u, v', w) = \sum_y \sum_x P_{XYZ|UVW}(x, y, z, u, v', w) \\
&= \sum_y \sum_x P_{XYZ|UVW}(x, y, z, u', v', w) = \sum_{x,y} P_{XYZ|UVW}(x, y, z, u', v', w)
\end{aligned}$$

□ 



G.2 Distance of Set given other Sets 

The following lemma is used in the proof of Lemma 16. 

Lemma 27. Assume random bits $S_1, \ldots, S_k$. If $S_k$ is uniform given all linear combinations over 
GF(2) of $S_1, \ldots, S_{k-1}$, i.e., $P_{S_k | \bigoplus_{i \in I} S_i}(0) = P_{S_k | \bigoplus_{i \in I} S_i}(1)$ for all $I \subseteq \{1, \ldots, k-1\}$, then $S_k$ is uniform 
given $S_1, \ldots, S_{k-1}$, i.e., $P_{S_k | S_1, \ldots, S_{k-1}}(0) = P_{S_k | S_1, \ldots, S_{k-1}}(1)$. 

Proof. We prove the case k = 3; the general case follows by induction. We have to show that 
if $P_{S_3|S_1}$, $P_{S_3|S_2}$ and $P_{S_3|S_1 \oplus S_2}$ are uniform, then $P_{S_3|S_1,S_2}$ is uniform. Consider the probabilities 
$P_{S_1,S_2,S_3}$. Because $P_{S_3|S_1}$ is uniform, we obtain the constraints 

$$P_{S_1,S_2,S_3}(0,0,0) + P_{S_1,S_2,S_3}(0,1,0) = P_{S_1,S_2,S_3}(0,0,1) + P_{S_1,S_2,S_3}(0,1,1) \qquad (17)$$
$$P_{S_1,S_2,S_3}(1,0,0) + P_{S_1,S_2,S_3}(1,1,0) = P_{S_1,S_2,S_3}(1,0,1) + P_{S_1,S_2,S_3}(1,1,1) .$$

Because $P_{S_3|S_2}$ is uniform, 

$$P_{S_1,S_2,S_3}(0,0,0) + P_{S_1,S_2,S_3}(1,0,0) = P_{S_1,S_2,S_3}(0,0,1) + P_{S_1,S_2,S_3}(1,0,1)$$
$$P_{S_1,S_2,S_3}(0,1,0) + P_{S_1,S_2,S_3}(1,1,0) = P_{S_1,S_2,S_3}(0,1,1) + P_{S_1,S_2,S_3}(1,1,1) . \qquad (18)$$

And from the fact that $P_{S_3|S_1 \oplus S_2}$ is uniform, we obtain 

$$P_{S_1,S_2,S_3}(0,0,0) + P_{S_1,S_2,S_3}(1,1,0) = P_{S_1,S_2,S_3}(0,0,1) + P_{S_1,S_2,S_3}(1,1,1) \qquad (19)$$
$$P_{S_1,S_2,S_3}(0,1,0) + P_{S_1,S_2,S_3}(1,0,0) = P_{S_1,S_2,S_3}(0,1,1) + P_{S_1,S_2,S_3}(1,0,1) .$$

Then subtract (18) from (17) and add (19) to obtain 

$$2 \cdot P_{S_1,S_2,S_3}(0,0,0) = 2 \cdot P_{S_1,S_2,S_3}(0,0,1) , \qquad (20)$$

which implies 

$$P_{S_3|S_1=0,S_2=0}(0) = \frac{P_{S_1,S_2,S_3}(0,0,0)}{P_{S_1,S_2,S_3}(0,0,0) + P_{S_1,S_2,S_3}(0,0,1)} = P_{S_3|S_1=0,S_2=0}(1) . \qquad (21)$$

Uniformity for all other values of $S_1$, $S_2$ then follows directly from the above equations. □ 
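
The hypothesis of Lemma 27 really needs all linear combinations, not only the individual bits. The Python sketch below illustrates this with an example of our own choosing (it is not taken from the proof): $S_1$ and $S_2$ are uniform and independent and $S_3 = S_1 \oplus S_2$. Then $S_3$ is uniform given $S_1$ and given $S_2$, but not given $S_1 \oplus S_2$, and indeed it is not uniform given the pair $(S_1, S_2)$.

```python
from fractions import Fraction
from itertools import product

def joint_xor():
    """Joint distribution of (S1, S2, S3) with S1, S2 uniform and S3 = S1 xor S2."""
    return {(s1, s2, s1 ^ s2): Fraction(1, 4) for s1, s2 in product((0, 1), repeat=2)}

def joint_independent():
    """Joint distribution of three independent uniform bits."""
    return {s: Fraction(1, 8) for s in product((0, 1), repeat=3)}

def is_uniform_given(P, f):
    """True if S3 is uniform conditioned on every occurring value of f(s1, s2)."""
    for value in {f(s1, s2) for (s1, s2, _) in P}:
        p0 = sum(p for (s1, s2, s3), p in P.items() if f(s1, s2) == value and s3 == 0)
        p1 = sum(p for (s1, s2, s3), p in P.items() if f(s1, s2) == value and s3 == 1)
        if p0 != p1:
            return False
    return True

conditions = {
    "S1": lambda s1, s2: s1,
    "S2": lambda s1, s2: s2,
    "S1 xor S2": lambda s1, s2: s1 ^ s2,
    "(S1, S2)": lambda s1, s2: (s1, s2),
}
report_xor = {name: is_uniform_given(joint_xor(), f) for name, f in conditions.items()}
report_ind = {name: is_uniform_given(joint_independent(), f) for name, f in conditions.items()}
```

In the XOR example the condition fails exactly for the linear combination $S_1 \oplus S_2$, which is why the conclusion of the lemma does not apply. For independent bits all conditions hold, and so does the conclusion.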



G.3 Linear Combination of Random Vectors 



The following lemma is used in the proof of Lemma [TrJl 

Lemma 28. Assume u and v are n-bit vectors and $P_U$ is the uniform distribution over all these 
vectors. Define the vector $w = u \oplus v$. Then w is again distributed according to the uniform 
distribution, i.e., 

$$P_W(w) = \sum_u P_U(u) \cdot P_U(u \oplus w) = P_U(w) .$$

Proof. The uniform distribution over all n-bit vectors can be obtained by drawing each of the n bits 
at random, i.e., P(0) = P(1) = 1/2. The XOR of two random bits is again a random bit, i.e., 
P(0) = P(1) = 1/2, and therefore w is also a vector drawn according to the uniform distribution 
over all n-bit vectors. □ 
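
For small n, the statement can be confirmed by enumerating all pairs of vectors. The Python sketch below assumes, as in the proof, that u and v are drawn independently and uniformly; the function name is our own.

```python
from collections import Counter
from itertools import product

def xor_distribution(n):
    """Distribution of w = u xor v for independent, uniform n-bit vectors u and v."""
    vectors = list(product((0, 1), repeat=n))
    counts = Counter(
        tuple(a ^ b for a, b in zip(u, v)) for u in vectors for v in vectors
    )
    total = len(vectors) ** 2
    return {w: counts[w] / total for w in vectors}

dist = xor_distribution(3)
```

Each of the $2^n$ possible values of w is hit by exactly $2^n$ of the $2^{2n}$ pairs (u, v), so every entry of `dist` equals $2^{-n}$.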



G.4 Average Epsilon 

The following lemma is used in the proof of Lemma [TBI 

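Lemma 29. Assume variables $\varepsilon_i$ for i = 1, . . . , n with average $\bar{\varepsilon} = \frac{1}{n} \cdot \sum_i \varepsilon_i$. Then 

$$\sum_{K \subseteq \{1,\ldots,n\}} \prod_{i \in K} \varepsilon_i \leq \sum_{K \subseteq \{1,\ldots,n\}} \prod_{i \in K} \bar{\varepsilon} .$$

Proof. We will show that when replacing $\varepsilon_1$ and $\varepsilon_2$ by their average, the value of the above expression 
only gets bigger. The lemma then follows by repeatedly combining the $\varepsilon_i$ in pairs and replacing them 
by their average. First note that replacing $\varepsilon_1$ and $\varepsilon_2$ by $\frac{\varepsilon_1 + \varepsilon_2}{2}$ each does not change the average 
$\bar{\varepsilon}$. Now calculate $\sum_{K \subseteq \{1,\ldots,n\}} \prod_{i \in K} \varepsilon_i$. For this, divide the sets K into different categories: the ones 
which contain neither $\varepsilon_1$ nor $\varepsilon_2$, which we call $K^{\emptyset}$; the ones which contain either $\varepsilon_1$ or $\varepsilon_2$, which we 
call $K^{\varepsilon_1}$ ($K^{\varepsilon_2}$); and the ones which contain both $\varepsilon_1$ and $\varepsilon_2$, called $K^{\varepsilon_1\varepsilon_2}$. 

$$\sum_{K \subseteq \{1,\ldots,n\}} \prod_{i \in K} \varepsilon_i = \sum_{K^{\emptyset}} \prod_{i \in K} \varepsilon_i + \sum_{K^{\varepsilon_1}} \prod_{i \in K} \varepsilon_i + \sum_{K^{\varepsilon_2}} \prod_{i \in K} \varepsilon_i + \sum_{K^{\varepsilon_1\varepsilon_2}} \prod_{i \in K} \varepsilon_i = (1 + \varepsilon_1 + \varepsilon_2 + \varepsilon_1\varepsilon_2) \sum_{K^{\emptyset}} \prod_{i \in K} \varepsilon_i$$

When replacing $\varepsilon_1$ and $\varepsilon_2$ by $\frac{\varepsilon_1 + \varepsilon_2}{2}$ each, clearly $\sum_{K^{\emptyset}} \prod_{i \in K} \varepsilon_i$ stays the same and the value of 
$1 + \varepsilon_1 + \varepsilon_2 + \varepsilon_1\varepsilon_2$ only becomes larger because $\varepsilon_1\varepsilon_2 \leq \left(\frac{\varepsilon_1 + \varepsilon_2}{2}\right)^2$. □ 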
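
The inequality can be checked on concrete numbers. The Python sketch below evaluates the sum over all subsets directly (with the empty product equal to 1) and compares it with the value obtained after replacing every $\varepsilon_i$ by the average. The sample values are hypothetical non-negative errors chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def subset_product_sum(eps):
    """Sum over all subsets K of the indices of prod_{i in K} eps[i]."""
    n = len(eps)
    return sum(
        prod(eps[i] for i in K)
        for r in range(n + 1)
        for K in combinations(range(n), r)
    )

def averaged(eps):
    """Replace every entry by the average of all entries."""
    mean = sum(eps) / len(eps)
    return [mean] * len(eps)

eps = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]  # hypothetical errors
lhs = subset_product_sum(eps)
rhs = subset_product_sum(averaged(eps))
```

Since the subset sum factorizes as $\prod_i (1 + \varepsilon_i)$, the lemma is an instance of the inequality between the geometric and the arithmetic mean applied to the factors $1 + \varepsilon_i$.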